PCT

WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)



(51) International Patent Classification:
C12N 5/00, 15/00, C07H 21/00

A1

(11) International Publication Number: WO 96/11260

(43) International Publication Date: 18 April 1996 (18.04.96)

(21) International Application Number: PCT/US95/13233

(22) International Filing Date: 6 October 1995 (06.10.95)



(30) Priority Data:
08/319745        7 October 1994 (07.10.94)        US



(71) Applicant: THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR
UNIVERSITY [US/US]; Stanford University, Stanford, CA 94305 (US).

(72) Inventors: SCOTT, Matthew, P.; 914 Wing Place, Stanford,
CA 94305 (US). GOODRICH, Lisa, V.; 66 Newell Road,
Palo Alto, CA 94303 (US). JOHNSON, Ronald, L.;
Apartment 7, 1528 Hudson Street, Redwood City, CA 94061
(US).

(74) Agents: ROWLAND, Bertram, I. et al.; Flehr, Hohbach, Test,
Albritton & Herbert, Suite 3400, 4 Embarcadero Center, San
Francisco, CA 94111-4187 (US).



(81) Designated States: AU, CA, JP, European patent (AT, BE,
CH, DE, DK, ES, FR, GB, GR, IE, IT, LU, MC, NL, PT,
SE).



Published 

With international search report. 



(54) Title: PATCHED GENES AND THEIR USE

(57) Abstract 

Invertebrate and vertebrate patched genes are provided, including the mouse and
human patched genes, as well as methods for [...] species or in the same family.
Having the ability to regulate [...] elucidation of embryonic development,
cellular regulation associated with signal transduction [...], the
identification of agonist and antagonist to signal transduction, identification
of ligands for binding to patched, [...] of the ligands, and assaying for levels
of transcription and expression of the [...]



FOR THE PURPOSES OF INFORMATION ONLY 



Codes used to identify States party to the PCT on the front pages of pamphlets publishing international 
applications under the PCT. 



AT  Austria                   GB  United Kingdom                         MR  Mauritania
AU  Australia                 GE  Georgia                                MW  Malawi
BB  Barbados                  GN  Guinea                                 NE  Niger
BE  Belgium                   GR  Greece                                 NL  Netherlands
BF  Burkina Faso              HU  Hungary                                NO  Norway
BG  Bulgaria                  IE  Ireland                                NZ  New Zealand
BJ  Benin                     IT  Italy                                  PL  Poland
BR  Brazil                    JP  Japan                                  PT  Portugal
BY  Belarus                   KE  Kenya                                  RO  Romania
CA  Canada                    KG  Kyrgyzstan                             RU  Russian Federation
CF  Central African Republic  KP  Democratic People's Republic of Korea  SD  Sudan
CG  Congo                     KR  Republic of Korea                      SE  Sweden
CH  Switzerland               KZ  Kazakhstan                             SI  Slovenia
CI  Côte d'Ivoire             LI  Liechtenstein                          SK  Slovakia
CM  Cameroon                  LK  Sri Lanka                              SN  Senegal
CN  China                     LU  Luxembourg                             TD  Chad
CS  Czechoslovakia            LV  Latvia                                 TG  Togo
CZ  Czech Republic            MC  Monaco                                 TJ  Tajikistan
DE  Germany                   MD  Republic of Moldova                    TT  Trinidad and Tobago
DK  Denmark                   MG  Madagascar                             UA  Ukraine
ES  Spain                     ML  Mali                                   US  United States of America
FI  Finland                   MN  Mongolia                               UZ  Uzbekistan
FR  France                                                               VN  Viet Nam
GA  Gabon











WO 96/11260                                                    PCT/US95/13233



PATCHED GENES AND THEIR USE 



Technical Field



The field of this invention concerns segment polarity genes and their uses.

Background

Segment polarity genes were discovered in flies as mutations which change
the pattern of structures of the body segments. Mutations in the genes cause animals
to develop the changed patterns on the surfaces of body segments, the changes
affecting the pattern along the head to tail axis. For example, mutations in the gene
patched cause each body segment to develop without the normal structures in the
center of each segment. In their stead is a mirror image of the pattern normally
found in the anterior segment. Thus cells in the center of the segment make the
wrong structures, and point them in the wrong direction with reference to the
overall head-to-tail polarity of the animal. About sixteen genes in the class are known.
The encoded proteins include kinases, transcription factors, a cell junction protein,
two secreted proteins called wingless (WG) and hedgehog (HH), a single
transmembrane protein called patched (PTC), and some novel proteins not related to
any known protein. All of these proteins are believed to work together in signaling
pathways that inform cells about their neighbors in order to set cell fates and
polarities.

Many of the segment polarity proteins of Drosophila and other invertebrates
are closely related to vertebrate proteins, implying that the molecular mechanisms
involved are ancient. Among the vertebrate proteins related to the fly genes are
En-1 and -2, which act in vertebrate brain development, and WNT-1, which is also
involved in brain development, but was first found as the oncogene implicated in
many cases of mouse breast cancer. In flies, the patched gene is transcribed into
RNA in a complex and dynamic pattern in embryos, including fine transverse stripes
in each body segment primordium. The encoded protein is predicted to contain
many transmembrane domains. It has no significant similarity to any other known
protein. Other proteins having large numbers of transmembrane domains include a
variety of membrane receptors, channels through membranes and transporters
through membranes.

The hedgehog (HH) protein of flies has been shown to have at least three
vertebrate relatives: Sonic hedgehog (Shh), Indian hedgehog, and Desert hedgehog.
Shh is expressed in a group of cells at the posterior of each developing limb
bud. This is exactly the same group of cells found to have an important role in
signaling polarity to the developing limb. The signal appears to be graded, with
cells close to the posterior source of the signal forming posterior digits and other
limb structures and cells farther from the signal source forming more anterior
structures. It has been known for many years that transplantation of the signaling
cells, a region of the limb bud known as the "zone of polarizing activity" (ZPA), has
dramatic effects on limb patterning. Implanting a second ZPA anterior to the limb
bud causes a limb to develop with posterior features replacing the anterior ones (in
essence little fingers instead of thumbs). Shh has been found to be the long sought
ZPA signal. Cultured cells making Shh protein (SHH), when implanted into the
anterior limb bud region, have the same effect as an implanted ZPA. This
establishes that Shh is clearly a critical trigger of posterior limb development.

The factor in the ZPA has been thought for some time to be related to
another important developmental signal that polarizes the developing spinal cord.
The notochord, a rod of mesoderm that runs along the dorsal side of early vertebrate
embryos, is a signal source that polarizes the neural tube along the dorsal-ventral
axis. The signal causes the part of the neural tube nearest to the notochord to form
floor plate, a morphologically distinct part of the neural tube. The floor plate, in
turn, sends out signals to the more dorsal parts of the neural tube to further
determine cell fates. The ZPA was reported to have the same signaling effect as the
notochord when transplanted to be adjacent to the neural tube, suggesting the ZPA
makes the same signal as the notochord. In keeping with this view, Shh was found
to be produced by notochord cells and floor plate cells. Tests of extra expression of
Shh in mice led to the finding of extra expression of floor plate genes in cells which
would not normally turn them on. Therefore Shh appears to be a component of the
signal from notochord to floor plate and from floor plate to more dorsal parts of the
neural tube. Besides limb and neural tubes, vertebrate hedgehog genes are also
expressed in many other tissues including, but not limited to, the peripheral nervous
system, brain, lung, liver, kidney, tooth primordia, genitalia, and hindgut and
foregut endoderm.

PTC has been proposed as a receptor for HH protein based on genetic
experiments in flies. A model for the relationship is that PTC acts through a largely
unknown pathway to inactivate both its own transcription and the transcription of the
wingless segment polarity gene. This model proposes that HH protein, secreted
from adjacent cells, binds to the PTC receptor, inactivates it, and thereby prevents
PTC from turning off its own transcription or that of wingless. A number of
experiments have shown coordinate events between PTC and HH.

Relevant Literature

Descriptions of patched, by itself or its role with hedgehog, may be found in
Hooper and Scott, Cell 59, 751-765 (1989); Nakano et al., Nature 341, 508-513
(1989) (both of which also describe the sequence for Drosophila patched); Simcox
et al., Development 107, 715-722 (1989); Hidalgo and Ingham, Development 110,
291-301 (1990); Phillips et al., Development 110, 105-114 (1990); Sampedro and
Guerrero, Nature 353, 187-190 (1991); Ingham et al., Nature 353, 184-187 (1991);
and Taylor et al., Mechanisms of Development 42, 89-96 (1993). Discussions of
the role of hedgehog include Riddle et al., Cell 75, 1401-1416 (1993); Echelard et
al., Cell 75, 1417-1430 (1993); Krauss et al., Cell 75, 1431-1444 (1993); Tabata
and Kornberg, Cell 76, 89-102 (1994); Heemskerk & DiNardo, Cell 76, 449-460
(1994); Roelink et al., Cell 76, 761-775 (1994); and a short review article by
Ingham, Current Biology 4, 347-350 (1994). The sequence for the Drosophila 5'
non-coding region was reported to the GenBank, accession number M28418,
referred to in Hooper and Scott (1989), supra. See also, Forbes, et al.,
Development 1993 Supplement 115-124.

SUMMARY OF THE INVENTION

Methods for isolating patched genes, particularly mammalian patched genes,
including the mouse and human patched genes, as well as invertebrate patched genes
and sequences, are provided. The methods include identification of patched genes
from other species, as well as members of the same family of proteins. The subject
genes provide methods for producing the patched protein, where the genes and
proteins may be used as probes for research, diagnosis, binding of hedgehog protein
for its isolation and purification, gene therapy, as well as other utilities.

DESCRIPTION OF THE DRAWINGS

Fig. 1 is a graph having a restriction map of about 10 kbp of the 5' region
upstream from the initiation codon of Drosophila patched gene and bar graphs of
constructs of truncated portions of the 5' region joined to β-galactosidase, where the
constructs are introduced into fly cell lines for the production of embryos. The
expression of β-gal in the embryos is indicated in the right-hand table during early
and late development of the embryo. The greater the number of +'s, the more
intense the staining.

DESCRIPTION OF SPECIFIC EMBODIMENTS

Methods are provided for identifying members of the patched (ptc) gene
family from invertebrate and vertebrate, e.g. mammalian, species, as well as the
entire cDNA sequence of the mouse and human patched gene. Also, sequences for
invertebrate patched genes are provided. The patched gene encodes a
transmembrane protein having a large number of transmembrane sequences.

In identifying the mouse and human patched genes, primers were employed
to move through the evolutionary tree from the known Drosophila ptc sequence.
Two primers are employed from the Drosophila sequence with appropriate
restriction enzyme linkers to amplify portions of genomic DNA of a related
invertebrate, such as mosquito. The sequences are selected from regions which are
not likely to diverge over evolutionary time and are of low degeneracy.
Conveniently, the regions are the N-terminal proximal sequence, generally within
the first 1.5 kb, usually within the first 1 kb, of the coding portion of the cDNA,
conveniently in the first hydrophilic loop of the protein. Employing the polymerase
chain reaction (PCR) with the primers, a band can be obtained from mosquito
genomic DNA. The band may then be amplified and used in turn as a probe. One
may use this probe to probe a cDNA library from an organism in a different branch
of the evolutionary tree, such as a butterfly. By screening the library and
identifying sequences which hybridize to the probe, a portion of the butterfly
patched gene may be obtained. One or more of the resulting clones may then be
used to rescreen the library to obtain an extended sequence, up to and including the
entire coding region, as well as the non-coding 5'- and 3'-sequences. As
appropriate, one may sequence all or a portion of the resulting cDNA coding
sequence.

One may then screen a genomic or cDNA library of a species higher in the
evolutionary scale with appropriate probes from one or both of the prior sequences.
Of particular interest is screening a genomic library of a distantly related
invertebrate, e.g. beetle, where one may use a combination of the sequences
obtained from the previous two species, in this case, the Drosophila and the
butterfly. By appropriate techniques, one may identify specific clones which bind to
the probes, which may then be screened for cross hybridization with each of the
probes individually. The resulting fragments may then be amplified, e.g. by
subcloning.

By having all or parts of the 4 different patched genes, in the presently
illustrated example, Drosophila (fly), mosquito, butterfly and beetle, one can now
compare the patched genes for conserved sequences. Cells from an appropriate
mammalian limb bud or other cells expressing patched, such as notochord, neural
tube, gut, lung buds, or other tissue, particularly fetal tissue, may be employed for
screening. Alternatively, adult tissue which produces patched may be employed for
screening. Based on the consensus sequence available from the 4 other species, one
can develop probes where at each site at least 2 of the sequences have the same
nucleotide; where the site varies such that each species has a unique nucleotide,
inosine may be used, which binds to all 4 nucleotides.
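
The probe rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `consensus_probe`, the toy sequences, and the tie-breaking rule for columns where two nucleotides are equally common are hypothetical and do not come from the specification.

```python
from collections import Counter


def consensus_probe(aligned_seqs):
    """Build a probe from equal-length, pre-aligned DNA sequences.

    At each column, if at least two sequences share a nucleotide, the most
    common nucleotide is used. If every sequence carries a different
    nucleotide, 'I' (inosine) is emitted instead.
    """
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    probe = []
    for column in zip(*(s.upper() for s in aligned_seqs)):
        counts = Counter(column)
        # Sort by descending count, then alphabetically, so that ties
        # (for example 2 vs 2) resolve deterministically. The tie-break
        # is an assumption made for this sketch.
        base, n = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        probe.append(base if n >= 2 else "I")
    return "".join(probe)


# Hypothetical toy alignment standing in for fly, mosquito, butterfly, beetle.
toy = ["ATGCA", "ATGGC", "ACGTG", "ATCAT"]
print(consensus_probe(toy))
```

In the toy alignment the first three columns each have a nucleotide shared by at least two sequences, while the last two columns differ in all four sequences and therefore receive inosine.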

Either PCR may be employed using primers or, if desired, a genomic library
from an appropriate source may be probed. With PCR, one may use a cDNA
library or use reverse transcriptase-PCR (RT-PCR), where mRNA is available from
the tissue. Usually, where fetal tissue is employed, one will employ tissue from the
first or second trimester, preferably the latter half of the first trimester or the second
trimester, depending upon the particular host. The age and source of tissue will
depend to a significant degree on the ability to surgically isolate the tissue based on
its size, the level of expression of patched in the cells of the tissue, the accessibility
of the tissue, the number of cells expressing patched and the like. The amount of
tissue available should be large enough so as to provide for a sufficient amount of
mRNA to be usefully transcribed and amplified. With mouse tissue, limb bud of
from about 10 to 15 dpc (days post conception) may be employed.

In the primers, the complementary binding sequence will usually be at least
14 nucleotides, preferably at least about 17 nucleotides and usually not more than
about 30 nucleotides. The primers may also include a restriction enzyme sequence
for isolation and cloning. With RT-PCR, the mRNA may be enriched in accordance
with known ways, reverse transcribed, followed by amplification with the
appropriate primers. (Procedures employed for molecular cloning may be found in
Molecular Cloning: A Laboratory Manual, Sambrook et al., eds., Cold Spring
Harbor Laboratories, Cold Spring Harbor, NY, 1988). Particularly, the primers may
conveniently come from the N-terminal proximal sequence or other conserved
region, such as those sequences where at least five amino acids are conserved out of
eight amino acids in three of the four sequences. This is illustrated by the sequences
(SEQ ID NO:11) IITPLDCFWEG, (SEQ ID NO:12) LIVGG, and (SEQ ID NO:13)
PFFWEQY. Resulting PCR products of expected size are subcloned and may be
sequenced if desired.
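
One way to read the "five of eight amino acids in three of the four sequences" criterion is as a sliding-window scan over an alignment. The sketch below is an interpretation, not the procedure used in the specification: the function name `conserved_windows`, its parameters, and the toy alignment are all hypothetical, and a column is assumed to count as conserved when at least three sequences share the same residue.

```python
from collections import Counter


def conserved_windows(aligned, window=8, min_conserved=5, min_agree=3):
    """Return start indices of windows that meet the conservation rule.

    A column is treated as conserved when at least `min_agree` sequences
    carry the same residue (gap characters '-' never count as agreement).
    A window qualifies when at least `min_conserved` of its columns are
    conserved.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    col_ok = []
    for column in zip(*aligned):
        counts = Counter(c for c in column if c != "-")
        col_ok.append(bool(counts) and max(counts.values()) >= min_agree)
    return [
        start
        for start in range(length - window + 1)
        if sum(col_ok[start:start + window]) >= min_conserved
    ]


# Hypothetical four-sequence alignment; three sequences share a 7-residue motif.
toy = ["PFFWEQYAAA", "PFFWEQYCCC", "PFFWEQYDDD", "GGGGGGGEEE"]
print(conserved_windows(toy))
```

Windows returned by such a scan would then be candidates for primer design, subject to the length limits given in the text.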

The cloned PCR fragment may then be used as a probe to screen a cDNA
library of mammalian tissue cells expressing patched, where hybridizing clones may
be isolated under appropriate conditions of stringency. Again, the cDNA library
should come from tissue which expresses patched, which tissue will come within the
limitations previously described. Clones which hybridize may be subcloned and
rescreened. The hybridizing subclones may then be isolated and sequenced or may
be further analyzed by employing RNA blots and in situ hybridizations in whole and
sectioned embryos. Conveniently, a fragment of from about 0.5 to 1 kbp of the
N-terminal coding region may be employed for the Northern blot.

The mammalian gene may be sequenced and, as described above, conserved
regions identified and used as primers for investigating other species. The
N-terminal proximal region, the C-terminal region or an intermediate region may be
employed for the sequences, where the sequences will be selected having minimum
degeneracy and the desired level of conservation over the probe sequence.
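
"Degeneracy" here can be made concrete by counting how many distinct nucleotide sequences could encode a given peptide. The sketch below assumes the standard genetic code and uses the three peptide motifs quoted earlier in the text as inputs; the function name `degeneracy` is hypothetical, and the specification does not state how degeneracy was actually scored.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Count the nucleotide sequences that can encode `peptide`."""
    return prod(CODON_COUNT[aa] for aa in peptide.upper())


for motif in ("IITPLDCFWEG", "LIVGG", "PFFWEQY"):
    print(motif, degeneracy(motif))
```

Under this measure, a motif rich in tryptophan, phenylalanine or glutamate residues (few codons each) yields a much smaller primer pool than one rich in leucine or serine.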

The DNA sequence encoding PTC may be cDNA or genomic DNA or a
fragment thereof, particularly complete exons from the genomic DNA, may be
isolated as the sequence substantially free of wild-type sequence from the
chromosome, may be a 50 kbp fragment or smaller fragment, may be joined to
heterologous or foreign DNA, which may be a single nucleotide, oligonucleotide of
up to 50 bp, which may be a restriction site or other identifying DNA for use as a
primer, probe or the like, or a nucleic acid of greater than 50 bp, where the nucleic
acid may be a portion of a cloning or expression vector, comprise the regulatory
regions of an expression cassette, or the like. The DNA may be isolated, purified
being substantially free of proteins and other nucleic acids, be in solution, or the
like.

The subject gene may be employed for producing all or portions of the
patched protein. The subject gene or fragment thereof, generally a fragment of at
least 12 bp, usually at least 18 bp, may be introduced into an appropriate vector for
extrachromosomal maintenance or for integration into the host. Fragments will
usually be immediately joined at the 5' and/or 3' terminus to a nucleotide or
sequence not found in the natural or wild-type gene, or joined to a label other than a
nucleic acid sequence. For expression, an expression cassette may be employed,
providing for a transcriptional and translational initiation region, which may be
inducible or constitutive, the coding region under the transcriptional control of the
transcriptional initiation region, and a transcriptional and translational termination
region. Various transcriptional initiation regions may be employed which are
functional in the expression host. The peptide may be expressed in prokaryotes or
eukaryotes in accordance with conventional ways, depending upon the purpose for
expression. For large production of the protein, a unicellular organism or cells of a
higher organism, e.g. eukaryotes such as vertebrates, particularly mammals, may be
used as the expression host, such as E. coli, B. subtilis, S. cerevisiae, and the like.
In many situations, it may be desirable to express the patched gene in a mammalian
host, whereby the patched gene will be transported to the cellular membrane for
various studies. The protein has two parts which provide for a total of six
transmembrane regions, with a total of six extracellular loops, three for each part.
The character of the protein has similarity to a transporter protein. The protein has
two conserved glycosylation signal triads.

The subject nucleic acid sequences may be modified for a number of
purposes, particularly where they will be used intracellularly, for example, by being
joined to a nucleic acid cleaving agent, e.g. a chelated metal ion, such as iron or
chromium for cleavage of the gene; as an antisense sequence; or the like.
Modifications may include replacing oxygen of the phosphate esters with sulfur or
nitrogen, replacing the phosphate with phosphoramide, etc.

With the availability of the protein in large amounts by employing an
expression host, the protein may be isolated and purified in accordance with
conventional ways. A lysate may be prepared of the expression host and the lysate
purified using HPLC, exclusion chromatography, gel electrophoresis, affinity
chromatography, or other purification technique. The purified protein will generally
be at least about 80% pure, preferably at least about 90% pure, and may be up to
100% pure. By pure is intended free of other proteins, as well as cellular debris.

The polypeptide may be used for the production of antibodies, where short
fragments provide for antibodies specific for the particular polypeptide, whereas
larger fragments or the entire gene allow for the production of antibodies over the
surface of the polypeptide or protein, where the protein may be in its natural
conformation.

Antibodies may be prepared in accordance with conventional ways, where
the expressed polypeptide or protein may be used as an immunogen, by itself or
[...] of injections, as appropriate. For monoclonal antibodies, after one or more
booster injections, the spleen may be isolated, the splenocytes immortalized, and
then screened for high affinity antibody binding. The immortalized cells, e.g.
hybridomas, [...] description, see Monoclonal Antibodies: A Laboratory Manual,
Harlow and Lane eds., Cold Spring Harbor Laboratories, Cold Spring Harbor, New
York, 1988. If [...] the mRNA encoding the heavy and light chains may be isolated
and [...] by cloning in E. coli, and the heavy and light chains may be mixed to
further enhance the affinity of the antibody. The antibodies may find use in
diagnostic assays for detection of the presence of the PTC protein [...] of the
PTC protein on the surface of [...]

The mouse patched gene (SEQ ID NO:09) encodes a protein (SEQ ID
NO:10) which has about 38% identical amino acids to fly PTC (SEQ ID NO:6) over
[...] amino acids. This amount of conservation is dispersed through much
of the protein, excepting the C-terminal region. The mouse protein also has a 50
[...] gene (SEQ ID NO:18) contains an open [...] about 96% identical (98%
similar) to mouse ptc (SEQ ID NO:09). The human patched gene (SEQ ID NO:18),
including coding and non-coding sequences [...]

The butterfly PTC homolog (SEQ ID NO:4) is 1,300 amino acids long and
[...] the coding sequence. A 267 bp exon from the beetle gene [...]
amino acid protein fragment, which was found to be 44% and 5[...]% identical to the
corresponding regions of fly and butterfly PTC respectively.

[...] is about 8 kb long and the message is present in low
levels as early as 7 dpc, the abundance increasing by 11 and 15 dpc. Northern blot
indicates a clear decrease in the amount of message at 17 dpc. In the adult, PTC
[...]

[...] obtain further 5' sequence to ensure that one has at least a functional
portion of the enhancer. It is found that the enhancer is proximal to the 5' coding
region, a portion being in the transcribed sequence and downstream from the
promoter sequences. The transcriptional initiation region may be used for many
purposes, studying embryonic development, providing for regulated expression of
patched protein or other protein of interest during embryonic development or
thereafter, and in gene therapy.

The gene may also be used for gene therapy, by transfection of the normal
gene into embryonic stem cells or into mature cells. A wide variety of viral vectors
can be employed for transfection and stable integration of the gene into the genome
of the cells. Alternatively, micro-injection may be employed, fusion, or the like for
introduction of genes into a suitable host cell. See, for example, Dhawan et al.,
Science 254, 1509-1512 (1991) and Smith et al., Molecular and Cellular Biology
(1990) 3268-3271.

By providing for the production of large amounts of PTC protein, one can
use the protein for identifying ligands which bind to the PTC protein. Particularly,
one may produce the protein in cells and employ the polysomes in columns for
isolating ligands for the PTC protein. One may incorporate the PTC protein into
liposomes by combining the protein with appropriate lipid surfactants, e.g.
phospholipids, cholesterol, etc., and sonicate the mixture of the PTC protein and the
surfactants in an aqueous medium. With one or more established ligands, e.g.
hedgehog, one may use the PTC protein to screen for antagonists which inhibit the
binding of the ligand. In this way, drugs may be identified which can prevent the
transduction of signals by the PTC protein in normal or abnormal cells.

The PTC protein, particularly binding fragments thereof, the gene encoding
the protein, or fragments thereof, particularly fragments of at least about 18
nucleotides, frequently of at least about 30 nucleotides and up to the entire gene,
more particularly sequences associated with the hydrophilic loops, may be employed
in a wide variety of assays. In these situations, the particular molecules will
normally be joined to another molecule, serving as a label, where the label can
directly or indirectly provide a detectable signal. Various labels include
radioisotopes, fluorescers, chemiluminescers, enzymes, specific binding molecules,
particles, e.g. magnetic particles, and the like. Specific binding molecules include
pairs, such as biotin and streptavidin, digoxin and antidigoxin, etc. For the specific
binding members, the complementary member would normally be labeled with a
molecule which provides for detection, in accordance with known procedures. The
assays may be used for detecting the presence of molecules which bind to the
patched gene or PTC protein, in isolating molecules which bind to the patched gene,
for measuring the amount of patched, either as the protein or the message, for
identifying molecules which may serve as agonists or antagonists, or the like.

Various formats may be used in the assays. For example, mammalian or
invertebrate cells may be designed where the cells respond when an agonist binds to
PTC in the membrane of the cell. An expression cassette may be introduced into
the cell, where the transcriptional initiation region of patched is joined to a marker
gene, such as β-galactosidase, for which a substrate forming a blue dye is available.
A 1.5 kb fragment that responds to PTC signaling has been identified and shown to
regulate expression of a heterologous gene during embryonic development. When
an agonist binds to the PTC protein, the cell will turn blue. By employing a
competition between an agonist and a compound of interest, absence of blue color
formation will indicate the presence of an antagonist. These assays are well known
in the literature. Instead of cells, one may use the protein in a membrane
environment and determine binding affinities of compounds. The PTC may be
bound to a surface and a labeled ligand for PTC employed. A number of labels
have been indicated previously. The candidate compound is added with the labeled
ligand in an appropriate buffered medium to the surface bound PTC. After an
incubation to ensure that binding has occurred, the surface may be washed free of
any non-specifically bound components of the assay medium, particularly any
non-specifically bound labeled ligand, and any label bound to the surface determined.
Where the label is an enzyme, substrate producing a detectable product may be used.
The label may be detected and measured. By using standards, the binding affinity of
the candidate compound may be determined.
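
The text does not say how a binding affinity is computed from such a competition measurement. One relation commonly used for competitive binding is the Cheng-Prusoff equation, Ki = IC50 / (1 + [L]/Kd), and the sketch below applies it purely as an illustration. The function name `cheng_prusoff_ki` and the numeric values are hypothetical; nothing here is taken from the specification.

```python
def cheng_prusoff_ki(ic50, ligand_conc, ligand_kd):
    """Estimate the inhibition constant Ki of a competing compound.

    ic50        -- concentration of candidate that halves labeled-ligand binding
    ligand_conc -- concentration of labeled ligand used in the assay
    ligand_kd   -- dissociation constant of the labeled ligand
    All three values must be in the same concentration units.
    """
    if ic50 <= 0 or ligand_conc < 0 or ligand_kd <= 0:
        raise ValueError("concentrations must be positive")
    return ic50 / (1.0 + ligand_conc / ligand_kd)


# Hypothetical values in nM: labeled ligand used at its Kd.
print(cheng_prusoff_ki(ic50=100.0, ligand_conc=10.0, ligand_kd=10.0))
```

The relation assumes simple competitive binding at a single site at equilibrium; other assay designs would need a different treatment.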

The availability of the gene and the protein allows for investigation of the
development of the fetus and the role patched and other molecules play in such
development. By employing antisense sequences of the patched gene, where the
sequences may be introduced in cells in culture, or a vector providing for
transcription of the antisense of the patched gene introduced into the cells, one can
investigate the role the PTC protein plays in the cellular development. By providing
for the PTC protein or fragment thereof in a soluble form which can compete with
the normal cellular PTC protein for ligand, one can inhibit the binding of ligands to
the cellular PTC protein to see the effect of variation in concentration of ligands for
the PTC protein on the cellular development of the host. Antibodies against PTC
can also be used to block function, since PTC is exposed on the cell surface.
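
As a small illustration of what an antisense sequence is at the sequence level, the sketch below computes the reverse complement of a sense-strand DNA fragment. The function name `antisense` and the example fragment are hypothetical; the specification does not give any particular antisense sequence here.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(sense_dna):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    unexpected = set(sense_dna) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return sense_dna.translate(_COMPLEMENT)[::-1]


# Hypothetical 6-nucleotide fragment, not taken from any patched sequence.
print(antisense("ATGGCC"))
```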

The subject gene may also be used for preparing transgenic laboratory
animals, which may serve to investigate embryonic development and the role the
PTC protein plays in such development. By providing for variation in the
expression of the PTC protein, employing different transcriptional initiation regions
which may be constitutive or inducible, one can determine the developmental effect
of the differences in PTC protein levels. Alternatively, one can use the DNA to
knock out the PTC protein in embryonic stem cells, so as to produce hosts with only
a single functional patched gene or where the host lacks a functional patched gene.
By employing homologous recombination, one can introduce a patched gene which
is differentially regulated, for example, is expressed during the development of the
fetus, but not in the adult. One may also provide for expression of the patched gene in
cells or tissues where it is not normally expressed or at abnormal times of
development. One may provide for mis-expression or failure of expression in
certain tissue to mimic a human disease. Thus, mouse models of spina bifida or
abnormal motor neuron differentiation in the developing spinal cord are made
available. In addition, by providing expression of PTC protein in cells in which it is
otherwise not normally produced, one can induce changes in cell behavior upon
binding of ligand to the PTC protein.

Areas of investigation may include the development of cancer treatments.
The wingless gene, whose transcription is regulated in flies by PTC, is closely
related to a mammalian oncogene, Wnt-1, a key factor in many cases of mouse
breast cancer. Other Wnt family members, which are secreted signaling proteins,
are implicated in many aspects of development. In flies, the signaling factor
decapentaplegic, a member of the TGF-beta family of signaling proteins, known to
affect growth and development in mammals, is also controlled by PTC. Since
members of both the TGF-beta and Wnt families are expressed in mice in places
close to or overlapping with patched, the common regulation provides an opportunity
in treating cancer. Also, for repair and regeneration, proliferation competent cells
making PTC protein can find use to promote regeneration and healing for damaged
tissue, which tissue may be regenerated by transfecting cells of damaged tissue with
the ptc gene and its normal transcription initiation region or a modified transcription
initiation region. For example, PTC may be useful to stimulate growth of new teeth
by engineering cells of the gum, or other tissues where PTC protein was expressed
during an earlier developmental stage or is expressed.

Since Northern blot analysis indicates that ptc is present at high levels in
adult lung tissue, the regulation of ptc expression or binding to its natural ligand
may serve to inhibit proliferation of cancerous lung cells. The availability of the
gene encoding PTC and the expression of the gene allows for the development of
agonists and antagonists. In addition, PTC is central to the ability of neurons to
differentiate early in development. The availability of the gene allows for the
introduction of PTC into host diseased tissue, stimulating the fetal program of
division and/or differentiation. This could be done in conjunction with other genes
which provide for the ligands which regulate PTC activity or by providing for
agonists other than the natural ligand.

The availability of the coding region for various ptc genes from various
species allows for the isolation of the non-coding region comprising the promoter
and enhancer associated with the ptc genes, so as to provide transcriptional and
post-transcriptional regulation of the ptc gene or other genes, which allow for regulation
of genes in relation to the regulation of the ptc gene. Since the ptc gene is
autoregulated, activation of the gene will result in activation of transcription of a
gene under the transcriptional control of the transcriptional initiation region of the
ptc gene. The transcriptional initiation region may be obtained from any host
species and introduced into a heterologous host species, where such initiation region
is functional to the desired degree in the foreign host. For example, a fragment of
from about 1.5 kb upstream from the initiation codon, up to about 10 kb, preferably
up to about 5 kb, may be used to provide for transcriptional initiation regulated by
the PTC protein, particularly the Drosophila 5'-non-coding region (GenBank
accession no. M28418).



The following examples are offered by way of illustration, not by way of limitation.

I. PCR on Mosquito (Anopheles gambiae) Genomic DNA

PCR primers were based on amino acid stretches of fly PTC that were not
likely to diverge over evolutionary time and were of low degeneracy. Two such
primers (P2R1 (SEQ ID NO:14): GGACGAATTCAARGTNCAYCARYTNTGG;
P4R1 (SEQ ID NO:15): GGACGAATTCCYTCCCARAARCANTC; the
underlined sequences are Eco RI linkers) amplified an appropriately sized band from
mosquito genomic DNA using the PCR. The program conditions were as follows:
94 °C 4 min.; 72 °C Add Taq;
[49 °C 30 sec.; 72 °C 90 sec.; 94 °C 15 sec.] 3 times
[94 °C 15 sec.; 50 °C 30 sec.; 72 °C 90 sec.] 35 times
72 °C 10 min.; 4 °C hold
This band was subcloned into the EcoRV site of pBluescript II and sequenced using
the USB Sequenase kit.
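
The "low degeneracy" criterion can be made concrete by counting how many distinct oligonucleotides each degenerate primer stands for. The following is a minimal sketch, not part of the original disclosure: it assumes the standard IUPAC meanings R = A/G, Y = C/T and N = A/C/G/T, and it looks only at the portion of each primer that follows the Eco RI linker. The names `degeneracy`, `expand`, `P2_CORE` and `P4_CORE` are illustrative.

```python
from itertools import product

# IUPAC codes that occur in the two primers (assumed standard meanings).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count

def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence a degenerate primer stands for."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]

# Degenerate portions of P2R1 and P4R1, i.e. the primers without the linker.
P2_CORE = "AARGTNCAYCARYTNTGG"
P4_CORE = "CYTCCCARAARCANTC"

print(degeneracy(P2_CORE), degeneracy(P4_CORE))
```

Under these assumptions the P2R1 portion represents 256 sequences and the P4R1 portion 32, which is small enough for every variant to be present at a useful concentration in the primer pool.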

II. Screen of a Butterfly cDNA Library with Mosquito PCR Product

Using the mosquito PCR product (SEQ ID NO:7) as a probe, a 3 day
embryonic Precis coenia λgt10 cDNA library (generously provided by Sean
Carroll) was screened. Filters were hybridized at 65 °C overnight in a solution
containing 5x SSC, 10% dextran sulfate, 5x Denhardt's, 200 µg/ml sonicated
salmon sperm DNA, and 0.5% SDS. Filters were washed in 0.1X SSC, 0.1% SDS
at room temperature several times to remove nonspecific hybridization. Of the
100,000 plaques initially screened, 2 overlapping clones, L1 and L2, were isolated,
which corresponded to the N terminus of butterfly PTC. Using L2 as a probe, the
library filters were rescreened and 3 additional clones (L5, L7, L8) were isolated
which encompassed the remainder of the ptc coding sequence. The full length
sequence of butterfly ptc (SEQ ID NO:3) was determined by ABI automated
sequencing.



III. Screen of a Tribolium (beetle) Genomic Library with Mosquito PCR Product
and 900 bp Fragment from the Butterfly Clone

A λgem11 genomic library from Tribolium castaneum (gift of Rob Dennell)
was probed with a mixture of the mosquito PCR product (SEQ ID NO:7) and a
BstXI/EcoRI fragment of L1. Filters were hybridized at 55 °C overnight and
washed as above. Of the 75,000 plaques screened, 14 clones were identified and the
SacI fragment of T8 (SEQ ID NO:1), which crosshybridized with the mosquito and
butterfly probes, was subcloned into pBluescript.

IV. Cloning Mouse ptc Homologues
Two degenerate PCR primers (P4REV (SEQ ID NO:16):
GGACGAATTCYTNGANTGYTTYTGGGA; P22 (SEQ ID NO:17):
CATACCAGCCAAGCTTGTCIGGCCARTGCAT) were designed based on a
comparison of PTC amino acid sequences from fly (Drosophila melanogaster) (SEQ
ID NO:6), mosquito (Anopheles gambiae) (SEQ ID NO:8), butterfly (Precis
coenia) (SEQ ID NO:4), and beetle (Tribolium castaneum) (SEQ ID NO:2). I
represents inosine, which can form base pairs with all four nucleotides. P22 was
used to reverse transcribe RNA from 12.5 dpc mouse limb bud (gift from David
Kingsley) for 90 min at 37 °C. PCR using P4REV (SEQ ID NO:16) and P22 (SEQ
ID NO:17) was then performed on 1 µl of the resultant cDNA under the following
conditions:
94 °C 4 min.; 72 °C Add Taq;
[94 °C 15 sec.; 50 °C 30 sec.; 72 °C 90 sec.] 35 times
72 °C 10 min.; 4 °C hold
PCR products of the expected size were subcloned into the TA vector (Invitrogen)
and sequenced with the Sequenase Version 2.0 DNA Sequencing Kit (U.S.B.).
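
The primer name P4REV suggests that it targets the same conserved stretch as the mosquito primer P4R1, but on the opposite strand. The sketch below is an illustration under stated assumptions rather than a statement from the original text: it takes the degenerate portions of the two primers as printed, applies the standard IUPAC complement rules (R pairs with Y, N with N), and checks that the reverse complement of the P4R1 portion overlaps the P4REV portion. The helper name `reverse_complement` and the constants are hypothetical.

```python
# Complement rules for the bases and IUPAC codes used in these primers.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A",
              "R": "Y", "Y": "R", "N": "N"}

def reverse_complement(seq: str) -> str:
    """Reverse complement of a sequence written with A, C, G, T, R, Y, N."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

# Degenerate portions only (the Eco RI linker is left out).
P4R1_CORE = "CYTCCCARAARCANTC"     # mosquito PCR, Example I
P4REV_CORE = "YTNGANTGYTTYTGGGA"   # mouse RT-PCR, Example IV

rc = reverse_complement(P4R1_CORE)
# P4REV carries one extra 5' codon (YTN); the next 14 bases match rc.
shared = P4REV_CORE[3:]
print(rc, shared, rc.startswith(shared))
```

If the overlap holds, P4REV primes the sense strand, which is consistent with P22 serving as the antisense primer for reverse transcription.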
Using the cloned mouse PCR fragment as a probe, 300,000 plaques of a
mouse 8.5 dpc λgt10 cDNA library (a gift from Brigid Hogan) were screened at
65 °C as above and washed in 2x SSC, 0.1% SDS at room temperature. 7 clones
were isolated, and three (M2, M4, and M8) were subcloned into pBluescript II.
200,000 plaques of this library were rescreened using first, a 1.1 kb EcoRI fragment
from M2 to identify 6 clones (M9-M16) and secondly a mixed probe containing the
most N terminal (XhoI fragment from M2) and most C terminal sequences
(BamHI/BglII fragment from M9) to isolate 5 clones (M17-M21). M9, M10, M14,
and M17-21 were subcloned into the EcoRI site of pBluescript II (Stratagene).

V. RNA Blots and in situ Hybridizations in Whole and Sectioned Mouse Embryos
Northerns:

A mouse embryonic Northern blot and an adult multiple tissue Northern blot
(obtained from Clontech) were probed with a 900 bp EcoRI fragment from an N
terminal coding region of mouse ptc. Hybridization was performed at 65 °C in 5x
SSPE, 10x Denhardt's, 100 µg/ml sonicated salmon sperm DNA, and 2% SDS.
After several short room temperature washes in 2x SSC, 0.05% SDS, the blots were
washed at high stringency in 0.1X SSC, 0.1% SDS at 50 °C.
In situ hybridization of sections:

7.75, 8.5, 11.5, and 13.5 dpc mouse embryos were dissected in PBS and
frozen in Tissue-Tek medium at -80 °C. 12-16 µm frozen sections were cut,
collected onto VectaBond (Vector Laboratories) coated slides, and dried for 30-60
minutes at room temperature. After a 10 minute fixation in 4% paraformaldehyde in
PBS, the slides were washed 3 times for 3 minutes in PBS, acetylated for 10 minutes
in 0.25% acetic anhydride in triethanolamine, and washed three more times for 5
minutes in PBS. Prehybridization (50% formamide, 5X SSC, 250 µg/ml yeast
tRNA, 500 µg/ml sonicated salmon sperm DNA, and 5x Denhardt's) was carried
out for 6 hours at room temperature in 50% formamide/5x SSC humidified
chambers. The probe, which consisted of 1 kb from the N-terminus of ptc, was
added at a concentration of 200-1000 ng/ml into the same solution used for
prehybridization, and then denatured for five minutes at 80 °C. Approximately 75
µl of probe were added to each slide and covered with Parafilm. The slides were
incubated overnight at 65 °C in the same humidified chamber used previously. The
following day, the probe was washed successively in 5X SSC (5 minutes, 65 °C),
0.2X SSC (1 hour, 65 °C), and 0.2X SSC (10 minutes, room temperature). After
five minutes in buffer B1 (0.1 M maleic acid, 0.15 M NaCl, pH 7.5), the slides were
blocked for 1 hour at room temperature in 1% blocking reagent (Boehringer-Mannheim)
in buffer B1, and then incubated for 4 hours in buffer B1 containing the
DIG-AP conjugated antibody (Boehringer-Mannheim) at a 1:5000 dilution. Excess
antibody was removed during two 15 minute washes in buffer B1, followed by five
minutes in buffer B3 (100 mM Tris, 100 mM NaCl, 5 mM MgCl2, pH 9.5). The
antibody was detected by adding an alkaline phosphatase substrate (350 µl 75 mg/ml
X-phosphate in DMF, 450 µl 50 mg/ml NBT in 70% DMF in 100 mls of buffer B3)
and allowing the reaction to proceed overnight in the dark. After a brief rinse in 10
mM Tris, 1 mM EDTA, pH 8.0, the slides were mounted with Aquamount (Lerner
Laboratories).
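
The substrate recipe above gives stock concentrations and volumes rather than working concentrations. As a rough worked example, and only under the assumptions that the two stocks are added to 100 ml of buffer B3 and that volumes are additive, the final concentrations can be estimated as follows. The function and variable names are illustrative and do not come from the original protocol.

```python
def final_concentration(stock_mg_per_ml: float, stock_volume_ul: float,
                        total_volume_ml: float) -> float:
    """Final concentration (mg/ml) of a stock diluted into total_volume_ml."""
    mass_mg = stock_mg_per_ml * stock_volume_ul / 1000.0  # convert µl to ml
    return mass_mg / total_volume_ml

BUFFER_ML = 100.0                     # buffer B3
XPHOS_STOCK, XPHOS_UL = 75.0, 350.0   # X-phosphate in DMF
NBT_STOCK, NBT_UL = 50.0, 450.0       # NBT in 70% DMF

# Assumption: volumes are additive, so the stocks add 0.8 ml to the buffer.
total_ml = BUFFER_ML + (XPHOS_UL + NBT_UL) / 1000.0
xphos = final_concentration(XPHOS_STOCK, XPHOS_UL, total_ml)
nbt = final_concentration(NBT_STOCK, NBT_UL, total_ml)

print(f"X-phosphate: {xphos:.3f} mg/ml, NBT: {nbt:.3f} mg/ml")
```

On these assumptions the working solution holds roughly 0.26 mg/ml X-phosphate and 0.22 mg/ml NBT; reading "in 100 mls" as the final volume instead changes the figures by less than one percent.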

VI. Drosophila 5'-transcriptional initiation region β-gal constructs.
A series of constructs were designed that link different regions of the ptc
promoter from Drosophila to a LacZ reporter gene in order to study the cis
regulation of the ptc expression pattern. See Fig. 1. A 10.8 kb BamHI/BspMI
fragment comprising the 5'-non-coding region of the mRNA at its 3'-terminus was
obtained and truncated by restriction enzyme digestion as shown in Fig. 1. These
expression cassettes were introduced into Drosophila lines using a P-element vector
(Thummel et al., Gene 74, 445-456 (1988)), which were injected into embryos,
providing flies which could be grown to produce embryos. (See Spradling and
Rubin, Science (1982) 218, 341-347 for a description of the procedure.) The vector
used a pUC8 background into which was introduced the white gene to provide for
yellow eyes, portions of the P-element for integration, and the constructs were
inserted into a polylinker upstream from the LacZ gene. The resulting embryos
were stained using antibodies to LacZ protein conjugated to HRP and the embryos
developed with OPD dye to identify the expression of the LacZ gene. The staining
pattern is described in Fig. 1, indicating whether there was staining during the early
and late development of the embryo.

VII. Isolation of a Mouse ptc Gene
Homologues of fly PTC (SEQ ID NO:6) were isolated from three insects:
mosquito, butterfly and beetle, using either PCR or low stringency library screens.
PCR primers to six amino acid stretches of PTC of low mutability and degeneracy
were designed. One primer pair, P2 and P4, amplified an homologous fragment of
ptc from mosquito genomic DNA that corresponded to the first hydrophilic loop of
the protein. The 345 bp PCR product (SEQ ID NO:7) was subcloned and sequenced
and when aligned to fly PTC, showed 67% amino acid identity.

The cloned mosquito fragment was used to screen a butterfly λgt10 cDNA
library. Of 100,000 plaques screened, five overlapping clones were isolated and
used to obtain the full length coding sequence. The butterfly PTC homologue (SEQ
ID NO:4) is 1,311 amino acids long and overall has 50% amino acid identity (72%
similarity) to fly PTC. With the exception of a divergent C-terminus, this homology
is evenly spread across the coding sequence. The mosquito PCR clone (SEQ ID
NO:7) and a corresponding fragment of butterfly cDNA were used to screen a beetle
λgem11 genomic library. Of the plaques screened, 14 clones were identified. A
fragment of one clone (T8), which hybridized with the original probes, was
subcloned and sequenced. This 3 kb piece contains an 89 amino acid exon (SEQ ID
NO:2) which is 44% and 51% identical to the corresponding regions of fly and
butterfly PTC respectively.
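
The link between the beetle genomic fragment (SEQ ID NO:1) and its deduced exon (SEQ ID NO:2) can be spot-checked by translating a short stretch of the nucleotide sequence. The sketch below is illustrative only: it assumes the standard genetic code, and it uses a 33-nucleotide run copied from the line of SEQ ID NO:1 that ends at position 300, chosen because it contains no ambiguous bases. The names `CODON_TABLE`, `translate` and `FRAGMENT` are not from the original.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna: str) -> str:
    """Translate DNA in frame 1; codons with non-ACGT symbols become 'X'."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[pos:pos + 3], "X"))
    return "".join(protein)

# 33-nt run from SEQ ID NO:1 as printed (spaces between blocks removed).
FRAGMENT = "GTTGCCGTTCCAATAAGAATAAATCTCGTCATA"

print(translate(FRAGMENT))
```

The result, VAVPIRINLVI, corresponds to the run Val Ala Val Pro Ile Arg Ile Asn Leu Val Ile listed in SEQ ID NO:2, so this stretch of the two listings appears to be mutually consistent.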

Using an alignment of the four insect homologues in the first hydrophilic loop
of the PTC, two PCR primers were designed to a five and six amino acid stretch
which were identical and of low degeneracy. These primers were used to isolate the
mouse homologue using RT-PCR on embryonic limb bud RNA. An appropriately
sized band was amplified and upon cloning and sequencing, it was found to encode a
protein 65% identical to fly PTC. Using the cloned PCR product and subsequently
fragments of mouse ptc cDNA, a mouse embryonic λ cDNA library was screened.
From about 300,000 plaques, 17 clones were identified and of these, 7 form
overlapping cDNA's which comprise most of the protein-coding sequence (SEQ ID
NO:9).

Developmentally, ptc mRNA is present in low levels as early as 7 dpc and is
quite abundant by 11 and 15 dpc. While the gene is still present at 17 dpc, the
Northern blot indicates a clear decrease in the amount of message at this stage. In
the adult, ptc RNA is present in high amounts in the brain and lung, as well as in
moderate amounts in the kidney and liver. Weak signals are detected in heart,
spleen, skeletal muscle, and testes.




VIIb. In situ Hybridization
Northern analysis indicates that ptc mRNA is present [...] while there is [...]
along the neural axis of 8.5 dpc embryos. By 11.5 dpc, ptc can [...]
lung buds and gut, consistent with its adult Northern [...]. In
addition, the gene is present at high levels in the ventricular zone of the central
nervous system, as well as in the zona limitans of the prosencephalon. ptc is also
strongly transcribed in the condensing cartilage of 11.5 and 13.5 dpc limb buds, as
well as in the ventral portion of the somites, a region which is prospective
sclerotome and eventually forms bone in the vertebral column. [...]
range of tissues from endodermal, mesodermal and ectodermal [...]
supporting its fundamental role in embryonic development.
To isolate human ptc (hptc), 2 x 10^5 plaques from a human lung cDNA library
(HU022a, Clontech) were screened with a 1 kbp mouse ptc fragment [...].
Filters were hybridized overnight at reduced stringency, 60 °C in 5X SSC, 10% [...]
sequence of mouse ptc) probes. Ten plaques were purified and of these, 6 inserts
were subcloned into pBluescript. To obtain the full coding sequence, H2 was fully
and H14, H20, and H21 were partially sequenced. The 5.1 kbp of human ptc
sequence (SEQ ID NO:18) contains an open reading frame of 1447 amino acids
(SEQ ID NO:19) that is 96% identical and 98% similar to mouse ptc. The 5' and 3'
untranslated sequences of human ptc (SEQ ID NO:18) are also highly similar to
mouse ptc (SEQ ID NO:9) suggesting conserved regulatory sequence.

IX. Comparison of Mouse, Human, Fly and Butterfly Sequences

The deduced mouse PTC protein sequence (SEQ ID NO:10) has about 38%
identical amino acids to fly PTC over about 1,200 amino acids. This amount of
conservation is dispersed through much of the protein excepting the C-terminal
region. The mouse protein also has a 50 amino acid insert relative to the fly
protein. Based on the sequence conservation of PTC and the functional conservation
of hedgehog between fly and mouse, one concludes that ptc functions similarly in
the two organisms. A comparison of the amino acid sequences of mouse (mptc)
(SEQ ID NO:10), human (hptc) (SEQ ID NO:19), butterfly (bptc) (SEQ ID NO:4)
and drosophila (ptc) (SEQ ID NO:6) is shown in Table 1.
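
Percent identity figures such as those quoted above are computed column by column over an alignment. The sketch below shows one common convention, counting only columns in which neither row has a gap; the original text does not say which convention its authors used, so this is an assumption. The two example rows are the final pair of rows printed in Table 1, which by the table's row order appear to be the human and mouse C-termini. The function name `percent_identity` is illustrative.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent of aligned, gap-free columns in which the two rows agree."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared

# Last two rows of Table 1 as printed (assumed to be HPTC and MPTC).
HUMAN_CTERM = "FHVRCERRDSKVEVIELQDVECEERPRGSSSN"
MOUSE_CTERM = "FHVRCERRDSKVEVIELQDVECEERPWGSSSN"

print(percent_identity(HUMAN_CTERM, MOUSE_CTERM))
```

These 32 columns differ at a single position, giving about 96.9% identity for this short stretch, in line with the 96% figure reported for the full-length human and mouse proteins.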



alignment of human, mouse, fly, and butterfly PTC homologs 




TABLE 1 



HPTC 
MPTC 
PTC 
BPTC 




MASAGHAAEPODR—GGGGSGCIGAPGRPAGGGRRRRTCGLRRAAAPDRDYLHRPSYCDA 



HPTC  AFALEQlSKGKATGRKAPLWLRAKrQRLI.FiaGCYIQKMCGKFLWGLLIFGArRVGIJCA
MPTC  ATALBQISKGKATGRKAPLWLRAKFQRLLFKLGCYrQKNCGKFLWGLLIFGAFAVGLKA
PTC   QVALDQIDKGKRRGSRTAIYLRSVFQSHLETLGSSVQKHAGKVI.EVAILVLSTFCVGLKS
BPTC  AlALSELEKCTJIEGGRTSUriRAWLQEQLFILGCFLQGDAGKVLFVMLVLSTFCVGLKS




HPTC  ANMTNVEELWVBVGGRVSRELMYTRQKIGEEAMFNPQLMlQTPKEEGftNVLTTEALLQH
MPTC  ANLETNVEELWVEVGGRVSRELNYTRQKIGEBAMFNPQLMIQTPKEEGANVLTTEALLQH
PTC   AQIHSKVHQLWIQBGGRLEAELAYTQKTISEDESATHQLLIQTTHDPNASVLHPQALLAH
BPTC  AQIHTRVDQUIVQEGGRLEAEIJCYTAQALGEADSSTHQLVIQTAKDPDVSLLHPGAU:,EH



HPTC  LDSALQASRVHVYMyNRQWKLEHI.CYKSGELITET-GYMDQIIEYLYPCLIITPLDCFWE
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BOSVRPBSTOK '^'!™T'???SlF|S^I»SIVIOTAVTVI.IArCTLLRW»DP 

** . * • • 

SKSQGaVGI^<^LI.v;U.V^-JCS"GXS 

SKSQGAVGIAGVLLVM^VAAGLGI^SWGIsnWA^ 

^mGQSSVG^mG^aIWC^STAAGLGLSA^iG^^ 

IRSQAGVGIAGVLIJ^ITVAAGLG^CAIiGIPF^^STQIVP£i^ . . 

_«_.*♦.**♦**. *•***. .*.*• ** •• 

SOTJ^RLLVFPAmSMUWRTAGRMirCCCF lPKKKIPER 

TNLGSILLVFPRMISLDLRRRSAARADLLCCm-P- ESP 
**.***..*.** ** * *•••* * 

PPYTSHSFAHETHITMQSTVQLRTEYDPHTHVYYTTMPWEIS^ 

II IIIIIIIIII-------- AKTRKNDKTHRID-TTRQPLDPDVS 

— — — — — — — — , » • . • ' 

• • • • 

.STSSTRDU^QrSDSSI^CXXPP™ 
ESTSSTRDI^OFSDSSLHC^PCTM^^SFAEra^^ 

TSVWGATI0na)6LDLT01VPEOTDEHEFI5RQEKYFGFYNl«AVTaeN ^ 
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HPTC 
MPTC 
PTC 
BPTC 



Hptc 

MPTC 
PTC 
10 BPTC 



32^^EX^^^"^^^^^^^°™^^^FSEWLGNI^KIFDEEYRDGRLTK^ 

YHDQrVRIPNIIKNDNGGLTKFWLSLFRDWLLDLQVAFDKEVASGCITQEYWCKNASDEG 
* * * * * ^ * * 

IIAYKLIVQTGHVDNPVDKELVLT-NRLVNSDGIINQRAFyNYLSAWATNDVFAYGASQG 
IIAYKIMVQTGHVDNPIDKSLITAGHRLVDKDGIINPKATYNYLSAWATNDAIAYGASQG 



HPTC  NIRPHRPEWVHDKADYMPETRLRIPAAEPIEYAQFPFYLNGLRDTSDFVEAIEKVRTICS
MPTC  NIRPHRPEWVHDKADYMPETRLRIPAAEPXEYAQFPFYLNGLRDTSDFVEAIEKVRVICN
PTC   KLYPEPRQYFHQPNEY DLKIPKSLPLVYAQMPFYLHGLTDTSQIKTLIGHIRDLSV
BPTC  NLKPQPQRWIHSPEDV HLEIKKSSPLIYTQLPFYLSGLSDTDSIKTLIRSVRDLCL



NYTSLGLSSYPNGYPrLFWEQYIGLRHWLLLFISWlACTrLVCAVFLlNPWTAGIIVMV 

NYTSLGLSSyPNGYPFLrWBQYISLRHWLLLSISWIACTrLVCAVrLLNPWTAGIIVMV 

KYEGFGLPhrfPSSIPFlFWEQYMTlRSSLAMILACVLIAALVLVSLLLLSVWAAVLVILS 

KYEAKGLPNFPSGIPFLFHE0VLyLRTSLLIJUACALGAVFIAVMVU,IJ«JOVAVLVTLA 
.* . ♦*...*.* **.*****. *♦ • « 



lAIMVELFOMGLIGIKI^AVPVVILIASVGIGVEFTVHVAlAFLTAIGDKNRRAVLAL 
lALMTVELFGMMGLIGIKLSAVPWI LIASVGIGVEFTVHVALAFLTAIGDKNHRAMLAL 
™,^25f*^^^^'^'"^*^^"^V®»«'CFNVLISLGFMTSVGNRQRRVQLSM 
IATLVLQLL6VMALLGVKLSAMPPVLLVLA1GRGVHFTVHLCLGFVTSIGCKRRRASLAL 



*.*••••• • *♦< 



EroffAPVODGAVSTLLGVUOAGSEFDFlVRyFFAVIAILTILGVLNGLVLLPVLLSFFG 
EHMFAPVLDGAVSTLI.GVLMLAGSEFDFIVRyFFAVLAILTVLGVLNGLVLI.PVLLSFFG 
QMSLGPLVHOa.TSGVAVFMLSTSPFEFVIRHFCWLLLWLCVGACNSLLVFPILLSMVG 
ESVIAPVVHGALAAAIAASMLAASEFGFVARLFLRLLLALVFLGLIDGLLFFPIVLSILG 



* ♦ * * 



* ** * 



PCPEVSPANGLNRLPTPSPEPPPSWRFAVPPGHTNNGSDSSDSEYSSQTTVSGISE-EL 
f^^!i^"^°*^^^''^^^^^^^'®S^^«SSRSSRGSCQKSHHHHHKDLNDPSL 

PAAEVRPIEHPERLSTPSPKCSPIHPRKSSSSSGGGDKSSRTS—KSAPRPC ^APSL 

•••• ♦ ♦ ♦ 

^55^?^^^^°^^^^E^PV™STVVHPESRHHPPSNPRQQPHLDSGSLPPGRQ 
F^YEAQQGAGGPAHQVIVEATENPVFARSTVVHPDSRHQPPLTPRQQPHLDSGSLS^ 

TTITEEPSSWHSSAHSVQSSMQSIWQPEWVETTTYNGSDSASGRSTPTKSSHGGAITT 



HPTC 
50 MPTC 
PTC 
BPTC 



GMPRRDPPREGLWPPLYRPRRDAFEISTEGHSGPSNRARWGPRGARSHNPRNPASTAMG 
GQQPRRDPPREGLRPPPYRPRRDAFEISTEGHSGPSNRDRSGPRGARSHNPRNPTSTAMG 

ll^Z^lT^f" YPPELQSIWQPEVTVETTHS DS 

TKVTATANIKVEWTPSDRKSRRSYHYYDRRRDRDEDRDRDRERDRDRDRDRDRDRDRDR 



HPTC  SSVPGYCQPITTVTASASVTVAVHPPPVPGPGRNPRGGI.CPGY PETDHGLFEDPHVP
MPTC  SSVPSYCQPITTVTASASVTVAVHPP--PGPGRNPRGGPCPGYESYPETDHGVFEDPHVP
PTC   HT TKVTATANIKVELAMP GRAVRS YNFTS
BPTC  DR DRERSRERDRRDRYRD ERDHRA SPRENGRDSGHE



FHVRCERRDSKVEVIELQDVECEERPRGSSSN 
FHVRCERRDSKVEVIELQDVECEERPWGSSSN 



-SDSSRH 
The identity of ten other clones recovered from the mouse library is not
determined. These cDNAs cross-hybridize with mouse ptc sequence, while differing
as to their restriction maps. These genes encode a family of proteins related to the
patched protein. Alignment of the human and mouse nucleotide sequences, which
includes coding and noncoding sequence, reveals 89% identity.

In accordance with the subject invention, mammalian patched genes, including
the mouse and human genes, are provided which allow for high level production of
the patched protein, which can serve many purposes. The patched protein may be
used in a screening for agonists and antagonists, for isolation of its ligand,
particularly hedgehog, more particularly Sonic hedgehog, and for assaying for the
transcription of the mRNA ptc. The protein or fragments thereof may be used to
produce antibodies specific for the protein or specific epitopes of the protein. In
addition, the gene may be employed for investigating embryonic development, by
screening fetal tissue, preparing transgenic animals to serve as models, and the like.

All publications and patent applications cited in this specification are herein
incorporated by reference as if each individual publication or patent application were
specifically and individually indicated to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be readily
apparent to those of ordinary skill in the art in light of the teachings of this invention
that certain changes and modifications may be made thereto without departing from
the spirit or scope of the appended claims.



SEQUENCE LISTING



(1) GENERAL INFORMATION : 

(i) APPLICANT: THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR 
UNIVERSITY 

(ii) TITLE OF INVENTION: Patched Genes and their Use 
(iii) NUMBER OF SEQUENCES: 19

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Flehr, Hohbach, Test, Albritton & Herbert 

(B) STREET: Four Embarcadero Center, Suite 3400

(C) CITY: San Francisco 

(D) STATE: CA 

(E) COUNTRY: US 

(F) ZIP: 94111 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/US95/ 

(B) FILING DATE: 06-OCT-1995 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Rowland, Bertram I

(B) REGISTRATION NUMBER: 20015 

(C) REFERENCE/DOCKET NUMBER: a60190-l 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 415-781-1989

(B) TELEFAX: 415-398-3249 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 736 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
AACNNCNNTN NATCGCACCC CCNCCCAACC TTTNNNCCNN NTAANCAAAA NNCCCCNTTT 60
NATACCCCCT NTAANANTrr TCCACCMNNC NNAWUiNCCN CTGNANACNA NONAAANCCN 120
TTTTTNAACC CCCCCCACCC CCAATTCCNA NTNNCCNCCC CCAAATTACA ACTCCAGNCC 180
AAAATTNANA NAATTGOTCC TAACCTAACC NATNGTTGTT ACGGTTTCCC CCCCCAAATA 240
CATGCACTCC CCCGAACACT TCATCGTTGC CGTTCCAATA AGAATAAATC TCGTCATATT 300
AAACAAGCCN AAAGCTTTAC AAACTGTTGT ACAATTAATG GGCGAACACG AACTGTTCGA 360
ATTCTGGTCT GGACATTACA AAGTGCACCA CATCMGATGG AACCAGGACA AGGCCACAAC 420
CGTACTGAAC GCCTGGCAGA AGAAGTTCGC ACAGGTTCGT GGTTGGCGCA AGGAGTAGAG 480
TGAATCGTGG TAATTTTTGC TTGTTCCACG AGGTGGATCG TCTGACGAAG AGCAAGAAGT 540
CGTCGAATTA CATCTTCGTG ACGTTCTCCA CCGCCAATTT GAACAAGATG TTGAAGGAGC 600
CGTCGAANAC GGACGTGGTG AAGCTGGGGG TGGTGCTGGG GGTGGCGGCG GTGTACGCGT 660
GGGTGGCCCA GTCGGCGCTG GCTGCCTTGG GAGTGCTGGT CTTNGCGNGC TNCMATTCGC 720
CCTATAOTNA GNCGTA 736

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 107 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Xaa Pro Pro Pro Asn Tyr Asn Ser Xaa Pro Lys Xaa Xaa Xaa Leu Val
1               5                   10                  15

Leu Thr Pro Xaa Val Val Thr Val Ser Pro Pro Lys Tyr Met His Trp
            20                  25                  30

Pro Glu His Leu Ile Val Ala Val Pro Ile Arg Ile Asn Leu Val Ile
        35                  40                  45

Leu Asn Lys Pro Lys Ala Leu Gln Thr Val Val Gln Leu Met Gly Glu
    50                  55                  60

His Glu Leu Phe Glu Phe Trp Ser Gly His Tyr Lys Val His His Ile
65                  70                  75                  80

Gly Trp Asn Gln Glu Lys Ala Thr Thr Val Leu Asn Ala Trp Gln Lys
                85                  90                  95

Lys Phe Ala Gln Val Gly Gly Trp Arg Lys Glu
            100                 105
(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5187 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GOGTCTGTCA CCCGGAGCCG GAGTCCCCCC CGCCCAGCAG CGTCCTCGOG AGCCGAGCGC 60
CCAGGCGCGC CCGGACCCOG CGGCGGCGGC GGCAACATGG CCTCGGCTGG TAACGCCGCC 120
GGGGCCCTGG GCAGGCAGGC CCGCGGCGGG AGGCGCACAC GGACCGGGGG ACCGCACCGC 180
CCCGCGCCGG ACCGGGACTA TCTCCACCGG CCCAGCTACT GCGACGCCGC CTTCGCTCTG 240
GAGCAGATTT CCAAGGGGAA GGCTACTGGC CGGAAAGCGC CGCTGTGGCT GAGAGCGAAG 300
TTTCAGAGAC TCTTATTTAA ACTGGGTTCT TACATTCAAA AGAACTGCCG CAAGTTTTTG 360
CTTGTGGGTC TCCTCATATT TCGGGCCTTC GCTGTGGGAT TAAAGGCAGC TAATCTCGAG 420
ACCAACGTCG AGGAGCTGTG GGTGGAACTT GGTCGACGAG TGAGTCGAGA ATTAAATTAT 480
ACCCGTCAGA AGATAGGAGA AGACGCTATG TTTAATCCTC AACTCATGAT ACAGACTCCA 540
AAAGAAGAAG GCGCTAATGT TCTGACCACA GAGGCTCTCC TGCAACACCT CGACTCAGCA 600
CTCCAGGCCA GTCGTCTGCA CGTCTACATG TATAACAGGC AATGCAACTT CGAACATTTG 660
TGCTACAAAT CAGGGGAACT TATCACGCAG ACAGGTTACA TGGATCACAT AATAGAATAC 720
CTTTACCCTT GCTTAATCAT TACACCTTTG GACTGCTTCT GGGAAGGGGC AAAGCTACAG 780
TCCGGGACAG CATAOCTCCT AGGTAAGCCT CCTTTACGGT GGACAAACTT TGACCCCTTG 840
GAATTCCTAC AAGAGTTAAA GAAAATAAAC TACCAAGTGG ACAGCTCCGA GGAAATGCTG 900
AATAAAGCCG AAGTTGGCCA TGGGTACATC GACCCGCCTT GCCTCAACCC AGCCGACCCA 960
GATTGCCCTG CCACAGCCCC TAACAAAAAT TCAACCAAAC CTCTTCATGT CGCCCTTCTT 1020
TTGAATGGTG GATGTCAAGG TTTATCCAGG AAGTATATGC ATTGGCAGGA GCAGTTGATT 1080
GTGGGTGGTA CCGTCAAGAA TCCCACTCGA AAACTTGTCA GCGCTCACGC CCTGCAAACC 1140
ATGTTCCACT TAATCACTCC CAACCAAATG TATCAACACT TCAGGGGCTA CGACTATGTC 1200
TCTCACATCA ACTCGAATGA AGACAGGGCA GCCGCCATCC TCCACGCCTG GCAGAGGACT 1260
TACGTGGAGG TGGTTCATCA AAGTGTCGCC CCAAACTCCA CTCAAAAGGT GCTTCCCTTC 1320
ACAACCACGA CCCTGGACGA CATCCTAAAA TCCTTCTCTG ATGTCAGTGT CATCCGAGTG 1380
GCCAGCGGCT ACCTACTGAT GCTTGCCTAT GCCTGTTTAA CCATGCTGCG CTGGGACTGC 1440
TCCAAGTCCC AGGGTGCCGT GGGGCTGGCT CGCGTCCTGT TGGTTGCGCT GTCAGTGCCT 1500
GCAGGATTGG GCCTCTGCTC CTTGATTCGC ATTTCTTTTA ATGCTGCGAC AACTCAGGTT 1560
TTGCCGTTTC TTGCTCTTGG TGTTGGTGTG GATGATGTCT TCCTCCTGGC CCATGCATTC 1620
AGTGAAACAG GACAGAATAA GAGGATTCCA TTTGAGGACA GGACTGGGGA GTGCCTCAAG 1680
CGCACCGGAG CCAGCGTGGC CCTCACCTCC ATCAGCAATG TCACCGCCTT CTTCATGGCC 1740
GCATTGATCC CTATCCCTGC CCTGCGAGCG TTCTCCCTCC AGGCTGCTGT GGTGGTGGTA 1800
TTCAATTTTG CTATGGTTCT GCTCATTTTT CCTGCAATTC TCAGCATGGA TTTATACAGA 1860
CGTGAGGACA GAAGATTGGA TATTTTCTGC TGTTTCACAA GCCCCTGTGT CAGCAGGGTG 1920
ATTCAAGTTG AGCCACAGGC CTACACAGAG CCTCACAGTA ACACCCGGTA CAGCCCCCCA 1980
CCCCCATACA CCAGCCACAG CTTCGCCCAC GAAACCCATA TCACTATGCA GTCCACCGTT 2040
CAGCTCCGCA CAGAGTATGA CCCTCACACG CACGTGTACT ACACCACCGC CGAGCCACGC 2100
TCTGAGATCT CTGTACAGCC TGTTACCCTC ACCCAGGACA ACCTCAGCTG TCAGAGTCCC 2160
GAGAGCACCA GCTCTACCAC GGACCTGCTC TCCCAGTTCT CAGACTCCAG CCTCCACTGC 2220
CTCGAGCCCC CCTGCACCAA GTGGACACTC TCTTCGTTTG CAGAGAAGCA CTATGCTCCT 2280
TTCCTCCTGA AACCCAAAGC CAACGTTGTG GTAATCCTTC TTTTCCTGGG CTTGCTGGGG 2340
GTCAGCCTTT ATGGGACCAC CCGAGTGAGA GACGGGCTGG ACCTCACCGA CATTCTTCCC 2400
CGGGAAACCA GAGAATATCA CTTCATAGCT GCCCACTTCA AGTACTTCTC TTTCTACAAC 2460
ATGTATATAG TCACCCAGAA AGCAGACTAC CCGAATATCC AGCACCTACT TTACGACCTT 2520
CATAAGAGTT TCAGCAATGT GAAGTATGTC ATCCTGGAGG AGAACAAGCA ACTTCCCCAA 2580
ATGTGGCTGC ACTACTTTAG ACACTGGCTT CAAGGACTTC AGGATGCATT TGACAGTGAC 2640
TGGGAAACTG GGAGGATCAT GCCAAACAAT TATAAAAATG GATCAGATGA CGGCGTCCTC 2700
GCTTACAAAC TCCTGGTGCA GACTGGCAGC CGAGACAAGC CCATCCACAT TAGTCACTTG 2760
ACTAAACAGC GTCTGCTAGA CGCAGATGGC ATCATTAATC CGAGCGCTTT CTACATCTAC 2820
CTGACCGCTT GGGTCAGCAA CGACCCTGTA GCTTACGCTG CCTCCCAGGC CAACATCCGG 2880
CCTCACCGGC CGGAGTGGGT CCATGACAAA GCCGACTACA TGCCAGAGAC CACGCTGAGA 2940
ATCCCAGCAC CAGAGCCCAT CGAGTACGCT CAGTTCCCTT TCTACCTCAA CGGCCTACGA 3000
GACACCTCAO ACTTTOTOOA AOCCKTAOAA AAAGTGAOAO TCATCTCTAA CAACTATAOO 3060
AGCCTGGGAC TGTCCAGCTA CCCCAATGCC TACCCCTTCC TGTTCTCGOA GCyUVTACATC 3120
ACCCTGCGCC ACTGGCTGCT GCTATCCATC AGCGTGGTGC TGGCCTGCAC CTTTCTAGTG 3180
TGCGCACTCT TCCTCCTGAA CCCCTGGACG GCCGGCATCA TTGTCATGGT CCTGGCTCTG 3240
ATGACCOTTC AGCTCTTTOO CATOATGGGC CTCATTGOGA TCAAGCTCAO TOCTCTGCCT 3300
GTGGTCATCC TCATTCCATC TCTTGGCATC GCA0TCCACT TCACCGTCCA CGTCCCTTTG 3360
GCCTTTCT0A CAOCCATTGO CGACAAOAAC CACACGGCTA TCCTCGCTCT GGAACACATC 3420
TTTCCTCCCC TTCTGCACOC TCCTGTGTCC ACTCTGCTGG GTCTACTGAT GCTTGCAGGG 3480
TCCCAATTTG ATTTCATTCT CAGATACTTC TTTCCCGTCC TGCCCATTCT CACCGTCTTG 3540
GGGOTTCTCA ATGGACTGCT TCTGCTGCCT GTCCTCTTAT CCTTCTTTGG ACCGTCTCCT 3600
GAGGTGTCTC CAOCCAATCC CCTAAACCCA CTCCCCACTC CTTCGCCTCA GCCGCCTCCA 3660
AGTGTCGTCC GGWTGCCGT GCCTCCTCGT CACACGAACA ATGGGTCTCA TTCCTCCGAC 3720
TCGGAGTACA GCTCTCACAC CACGGTGTCT GGCATCACTG AGGACCTCAG GCAATACCAA 3780
GCACAGCAGG GTCCCGGAGG CCCTGCCCAC CAAGTGATTG TGGAAGCCAC ACAAAACCCT 3840
GTCTTTGCCC OGTCCACTGT GGTCCATCCG GACTCCAGAC ATCAGCCTCC CTTGACCCCT 3900
CGGCAACAGC CCCACCTGCA CTCTCGCTCC TTGTCCCCTG GACGGCAAGC CCAGCAGCCT 3960
CGAAGGGATC CCCCTACACA ACCCTTCCCC CCACCCCCCT ACACACCGCG CAGAGACGCT 4020
TTTGAAATTT CTACTCAAGG GCATTCTGGC CCTAGCAATA GGGACCGCTC AGGGCCCCGT 4080
GGGGCCCGTT CTCACAACCC TCGGAACCCA ACCTCCACCG CCATGGGCAG CTCTGTGCCC 4140
AGCTACTGCC AGCCCATCAC CACTCTGACO GCTTCTCCTT CGGTCACTGT TGCTGTGCAT 4200
CCCCCGCCTG GACCTGGGCG CAACCCCCGA CGGCGGCCCT GTCCAGCCTA TGAGAGCTAC 4260
CCTGACACTG ATCACGGGGT ATTTCAGGAT CCTCATGTGC CTTTTCATGT CACGTGTGAG 4320
AGGAGGGACT CAAAGCTGGA GGTCATAGAG CTACAGGACG TGGAATGTGA GGAGAGGCCC 4380
TGGGGGAGCA GCTCCAACTG AGCGTAATTA AAATCTGAAG CAAAGAGCCC AAAGAXTGGA 4440
AAGCCCOGCC CCCACCTCTT TCCACAACTG CTTCAAGAGA ACTGCTTCGA ATTATGGGAA 4500
CGCACTTCAT TCTTACTOTA ACTGATTCTA TTATTKKGTG AAATATTTCT ATAAATATTT 4560
AARAGGTCTA CACATGTAAT ATACATGGAA ATGCTCTACA GTCTATTTCC TCGCGCCTCT 4620
CCACTCCTGC CCCAOAGTGC GGA0ACCACA GGGGCCCTTT CCCCTGTGTA CATTGCTCTC 4680
TCTCCCACAA CCAAOCTTAA CTTAOTTTTA AAAAAAATCT CCCAGCATAT GTCGCTGCTG 4740
CTTAAATATT GTATAATTTA CTTOTATAAT TCTATCCAAA TATTCCTTAT GTAATACGAT 4800
TATTTeTAAA GCTTTCTGTT TAAAATATTT TAAATTTGCA TATCACAACC CTGTGGTAGG 4860
«*/.»aTTeTT ACTOTTAACT TTTGAACACG CTATOCGTGG TAATTCTTTA ACGAGCAGAC 4920
ATGAAGAAAA CACCTIAATC CCAGTGCCTT CTCTAGGGCT AGTTGTATAT GGTTCGCATG 4980
GGTGGATGTG TGTGTGCATG TOACTTTCCA ATGTACTOTA TTGTOGTTTC TTGTTGTTGT 5040
TGCTCTTGTT GTTCATTTTC GTCTTTTTGG TTGCTTTGTA TGATCTTAGC TCTGGCCTAG 5100
GTCGGCTGGG AAGGTCCAGG TCTTTTTCTG TCGTGATCCT GGTGGAAAGG TGACCCCAAT 5160
CATCTGTCCT ATTCTCTGGG ACTATTC 5187






(2) INFORMATION FOR SEQ ID NO:4:







(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1311 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Val Ala Pro Asp Ser Glu Ala Pro Ser Asn Pro Arg Ile Thr Ala
1 5 10 15

Ala His Glu Ser Pro Cys Ala Thr Glu Ala Arg His Ser Ala Asp Leu
20 25

Tyr Ile Arg Thr Ser Trp Val Asp Ala Ala Leu Ala Leu Ser Glu Leu
35 40 45

Glu Lys Gly Asn Ile Glu Gly Gly Arg Thr Ser Leu Trp Ile Arg Ala
50 55

Trp Leu Gln Glu Gln Leu Phe Ile Leu Gly Cys Phe Leu Gln Gly Asp
65 70 75

Ala Gly Lys Val Leu Phe Val Ala Ile Leu Val Leu Ser Thr Phe Cys
85

Val Gly Leu Lys Ser Ala Gln Ile His Thr Arg Val Asp Gln Leu Trp
100 105 110

Val Gln Glu Gly Gly Arg Leu Glu Ala Glu Leu Lys Tyr Thr Ala Gln
115 120

Ala Leu Gly Glu Ala Asp Ser Ser Thr His Gln Leu Val Ile Gln Thr
130 135

Ala Lys Asp Pro Asp Val Ser Leu Leu His Pro Gly Ala Leu Leu Glu
160

His Leu Lys Val Val His Ala Ala Thr Arg Val Thr Val His Met Tyr

170 

ASP lie Gl« Trp Arg Leu Ly. Asp l.u Cya Tyr Ser Pro Ser lie Pro 



"5 



Asp Phe Glu Gly Tyr His His Ile Glu Ser Ile Ile Asp Asn Val Ile

205 

Pro cy. M. xi. II. Thr Pro A.p cy. Ph. Trp 01. c,y s.r Ly. 



220 



Leu Leu Gly Pro Asp Tyr Pro Ile Tyr Val Pro His Leu Lys His Lys
240

Leu Gln Trp Thr His Leu Asn Pro Leu Glu Val Val Glu Glu Val Lys
255

Lys Leu Lys Phe Gln Phe Pro Leu Ser Thr Ile Glu Ala Tyr Met Lys
265 270

Arg Ala Gly Ile Thr Ser Ala Tyr Met Lys Lys Pro Cys Leu Asp Pro

285 

Thr «p ^ Hi. ^. „^ ^ 

n. pro ».p V.1 „. ,1. ^ ^ 

Ala Ala Tyr Met His Trp Pro Glu Gln Leu Ile Val Gly Gly Ala Thr
335

Arg Asn Ser Thr Ser Ala Leu Arg Lys Ala Arg Xaa Leu Gln Thr Val

345 

V.1 «„ H« «y „. ^ « ^ 

-^^0 365 
T,r v.1 Hi. «n XI. „, I,p „„ «. Ly. „. „. 

Leu Asp Ala Trp Gln Arg Lys Phe Ala Ala Glu Val Arg Lys Ile Thr
395

Thr Ser Gly Ser Val Ser Ser Ala Tyr Ser Phe Tyr Pro Phe Ser Thr

410 

ser Thr Leu A.„ A.p He Leu Gly Ly. Phe Ser Glu Val Ser Leu Ly. 

430 

Aan lie He ciy Tyr Met Ph« iioi- ^^ ^ 

435 ^ Ala Val Thr 

445 

Leu Ile Gln Trp Arg Asp Pro Ile Arg Ser Gln Ala Gly Val Gly Ile

Ala Gly Val Leu Leu Leu Ser Ile Thr Val Ala Ala Gly Leu Gly Phe
465

cya Ala Leu Leu Cly He Pro Phe Asn Ala Ser Ser T.r Gin Ue Val 
485 



P„ Ph. L.U »1. L.U Ol, L.U Oly V.1 01» "P 



500 



505 



HI. Thr Tyr V.1 «u oln »l. Oly A.P V.1 pro «, =1« OX- Ar, Ihr 

515 "0 
Gly Leu Val Leu Lys Lys Ser Gly Leu Ser Val Leu Leu Ala Ser Leu
530 535

Cys Asn Val Met Ala Phe Leu Ala Ala Ala Leu Leu Pro Ile Pro Ala
545 550 555

Phe Arg Val Phe Cys Leu Gln Ala Ala Ile Leu Leu Leu Phe Asn Leu
565 570

Gly Ser Ile Leu Leu Val Phe Pro Ala Met Ile Ser Leu Asp Leu Arg



580 585 



Ser Ala Ala Arg Ala Asp Leu Leu Cys Cys Leu Met Pro Glu
595

Ser Pro Leu Pro Lys Lys Lys Ile Pro Glu Arg Ala Lys Thr Arg Lys
610 615 620

Asn Asp Lys Thr His Arg Ile Asp Thr Thr Arg Gln Pro Leu Asp Pro
625 

ASP val ser Glu Asn Val Thr Lys Thr Cy« ^« |S 
645 

Thr Ly. Trp Al. Ly. «. ol» Tyr «. Pro Ph. .1. K.t Ar, Pro »U 

660 

v.1 Ly. v.1 Thr s.r M.« L.» »1. L» II. M. v.1 ne ^ Thr s.r 

675 

Val Trp Gly Ala Thr Lys Val Lys Asp Gly Leu Asp Leu Thr Asp Ile
690

Val Pro Glu Asn Thr Asp Glu His Glu Phe Leu Ser Arg Gln Glu Lys
705

Tyr Phe Gly Phe Tyr Asn Met Tyr Ala Val Thr Gln Gly Asn Phe Glu

725 

Tyr pro Thr «. oln Ly. L«. U» Tyr Olu Tyr »i. «P «» Ph. v.1 

740 

Arg n Pro Asn Ile Ile Lys Asn Asp Asn Gly Gly Leu Thr Lys Phe
755

Trp Leu Ser Leu Phe Arg Asp Trp Leu Leu Asp Leu Gln Val Ala Phe
775 780

Asp Lys Glu Val Ala Ser Gly Cys Ile Thr Gln Glu Tyr Trp Cys Lys
795 800

Asn Ala Ser Asp Glu Gly Ile Leu Ala Tyr Lys Leu Met Val Gln Thr
810 815

Gly His Val Asp Asn Pro Ile Asp Lys Ser Leu Ile Thr Ala Gly His
820 825 830

Arg Leu Val Asp Lys Asp Gly Ile Ile Asn Pro Lys Ala Phe Tyr Asn
840 845

Tyr Leu Ser Ala Trp Ala Thr Asn Asp Ala Leu Ala Tyr Gly Ala Ser
855 860

Gln Gly Asn Leu Lys Pro Gln Pro Gln Arg Trp Ile His Ser Pro Glu
870 875 880

Asp Val His Leu Glu Ile Lys Lys Ser Ser Pro Leu Ile Tyr Thr Gln
885 890 895

Leu Pro Phe Tyr Leu Ser Gly Leu Ser Asp Thr Xaa Ser Ile Lys Thr
900 905

Leu Ile Arg Ser Val Arg Asp Leu Cys Leu Lys Tyr Glu Ala Lys Gly
915 920

Leu Pro Asn Phe Pro Ser Gly Ile Pro Phe Leu Phe Trp Glu Gln Tyr
930 935 940

Leu Tyr Leu Arg Thr Ser Leu Leu Leu Ala Leu Ala Cys Ala Leu Ala
950 955

Ala Val Phe Ile Ala Val Met Val Leu Leu Leu Asn Ala Trp Ala Ala
965 970 975

Val Leu Val Thr Leu Ala Leu Ala Thr Leu Val Leu Gln Leu Leu Gly
980 985 990

Val Met Ala Leu Leu Gly Val Lys Leu Ser Ala Met Pro Ala Val Leu
995 1000 1005

^nJo^*" ^^'^ P*** Thr val His Leu cys 

"10 1015 1020 

Leu Gly Phe Val Thr Ser Ile Gly Cys Lys Arg Arg Arg Ala Ser Leu
1030 1035 1040

Ala Leu Glu Ser Val Leu Ala Pro Val Val His Gly Ala Leu Ala Ala
1045 1050 1055

Ala Leu Ala Ala Ser Met Leu Ala Ala Ser Glu Cys Gly Phe Val Ala
1060 1065 1070

Arg Leu Phe Leu Arg Leu Leu Leu Asp Ile Val Phe Leu Gly Leu H
1075 1080 1085

Asp Gly Leu Leu Phe Phe Pro Ile Val Leu Ser Ile Leu Gly Pro Ala
1090 1095 1100

Ala Glu Val Arg Pro Ile Glu His Pro Glu Arg Leu Ser Thr Pro Ser
1105 1110 1115

Pro Lys Cys Ser Pro Ile His Pro Arg Lys Ser Ser Ser Ser Ser Gly
1125 1130 1135

Gly Gly Asp Lys Ser Ser Arg Thr Ser Lys Ser Ala Pro Arg Pro Cys
1140 1145 1150

Ala Pro Ser Leu Thr Thr Ile Thr Glu Glu Pro Ser Ser Trp His Ser
1155 1160 1165

Ser Ala His Ser Val Gln Ser Ser Met Gln Ser Ile Val Val Gln Pro
1170 1175 1180

Glu Val Val Val Glu Thr Thr Thr Tyr Asn Gly Ser Asp Ser Ala Ser
1185 1190 1195 1200

Gly Arg Ser Thr Pro Thr Lys Ser Ser His Gly Gly Ala Ile Thr Thr
1205 1210 1215

Thr Lys Val Thr Ala Thr Ala Asn Ile Lys Val Glu Val Val Thr Pro
1220 1225 1230

Ser Asp Arg Lys Ser Arg Arg Ser Tyr His Tyr Tyr Asp Arg Arg Arg
1235 1240 1245

Asp Arg Asp Glu Asp Arg Asp Arg Asp Arg Glu Arg Asp Arg Asp Arg
1250 1255 1260

Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg
1265 1270 1275

Glu Arg Ser Arg Glu Arg Asp Arg Arg Asp Arg Tyr Arg Asp Gl" Arg
1285 1290 1295

Asp His Arg Ala Ser Pro Arg Glu Lys Arg Gln Arg Phe Trp Thr
1300 1305 1310

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4434 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
CCAAACAACA GAOCCAOTOA OACTACOCAO AGCGTCTCTO TTCTCTOTTO ACTGTCOCCC 60
ACGCACACAC CCCCAAAACA GTGCACACAG ACOCCCCCTO GCCAACACAC AGTGAGAGAG 120
AGAAACAGCG GtX;CGCGCTC GCCTAATCAA GrTCTTCGCC TGGCTGGCGT CCCCCATCCA 180
COAGATACAO ATACATCTCT CATGOACCGC GACAGCCTCC CACCOCTTCC GGACACACAC 240
GGCGAT6TGG TCGATOACAA ATTATTCTCO GATCTTTACA TACCCACCAG CTCGGTGGAC 300
GCCCAACTCO CGCTCGATCA GATAGATAAG CGCAAAGCGC GTGGCACCC6 CAOGGCGATC 360
TATCTGCGAT CAGTATTCCA GTCCCACCTC GAAACCCTOG GCACCTCCXST GCAAAAGCAC 420
GCGGGCAAGG TGCTATTCGT GGCTATCCTO GTGCTCAGCA CCTTCTGCCT CGGCCTGAAC 480
AGCGCCCAGA TCCACTCCAA GGTCCACCAO CTGTGGATCC AGGAGGCCGG CCCGCTGGAG 540
GCGGAACTGG CCTACACACA GAAGACGATC CGCGAGGACG AGTCGGCCAC CCATCAGCTC 600
CTCATTCAGA CGACCCACGA CCCGAAOGCC TCCGTCCTGC ATCCGCAGGC GCTGCTTGCC 660
CACCTGGAGG TCCTGGTCAA GGCCACCGCC GTCAAGGTGC ACCTCTACGA CACCOAATGG 720
GGGCTGCGCG ACATGTGCAA CATGCCGAGC AOGCCCTCCT TCGAGGGCAT CTACTACATC 780
GAGCAGATCC TGCGCCACCT CATTCCGTGC TOGATCATCA CGCCGCTGCA CTCTTTCTGG 840
GAGGGAAGCC AGCTGTTGGG TCCGCAATCA GCGGTCGTTA TACCAGGCCT CAACCAACGA 900
CTCCTGTGGA CCACCCT«AA TCCOGCCTCT GTGATGCAGT ATATGAAACA AAAGATGTCC 960
GAGGAAAAGA TCAGCTTCGA CTTCCAGACC GT6GAGCAGT ACATGAAGCG TGOGGCCATT 1020
CGCAGTGGCT ACATGGAGAA GCCCTGCCTG AACCCACTGA ATCCCAATTG CCCGGACACG 1080
GCACCGAACA AGAACAGCAC CCAGCCGCOG GATGTGGGAG CCATCclrGTC CGGACGCTCC 1140
TACX^GTTATG CCGCGAAOCA CAT«CACTGG CCCGAGGAGC TGATTGTCGG CGGACGGAAG 1200
AGGAACOGCA GCCGACACTT GAGGAAGGCC CAGGCCCTGC AGTCCGTGGT GCAGCTGATG 1260
ACCGAGAAGG AAATGTACGA CCAGTGOCAC GACAACTACA AGCTGCACCA TCTTGGATCG 1320
ACXJCAGGAGA AGGCAGCGCA GCTTTTCAAC GCCTGGCAGC GCAACTTTTC GCGGGAGGT« 1380
GAACAGCTGC TAOGTAAACA GTCGAGAATT GCCACCAACT ACGATATCTA CGTGTTCAGC 1440
TCGGCTGCAC TGGATGACAT CCTCGCCAAG TTCTCCCATC CCACCGCCTT GTCCATTGTC 1500
ATCCGCGTGO CCGTCACCGT TTTCTATGCC TTTTGCACGC TCCTCCGCTG GAGGGACCCC 1560
CTCCGTGGCC AGAGCAG,«T C6GCCTCCCC GGAGTTCTGC TCATCTGCTT CAGTACCGCC 1620
GCOGGATTGG GATTCTCAGC OCTGCTCGGT ATCGTTTTCA ATGOGCTCAC CCCTGCCTAT 1680
GCGGAGAGCA ATCGGCGGCA CCAGACCAAG CTGATTCTCA AGAACCCCAG CACCCAGCTG 1740
GTTCCGTTTT TCGCCCTTGG TCTGGOCGTC GATCACATCT TCATAGTGGG ACCGAGCATC 1800
CTGTTCAGTG CCTGCAGCAC CGCAGGATCC TTCTTTGCGG CCGCCTTTAT TCOGGTGCCG 1860
GCTTTGAAGG TATTCTGTCT GCAGGCTGCC ATCGTAATGT GCTCCAATTT GGCAGCGCCT 1920
CTATTGGTTT TTCCGGCCAT GATTTOGTTG GATCTACGGA GACGTACC6C CGGCAGGGCG 1980
GACATCTTCrr GCTGCTGTTT TCCGGTGTGG AAGGAACAGC CGAAGGTGGC ACCTCCGGTG 2040
CTGCCGCTGA ACAACAACAA CGGGCGOGGG GCCCGGCATC CGAAGAGCTG CAACAACAAC 2100
AGGGTGCCGC TGCCCGCCGA GAATCCTCTG CTGGAACAGA GGGCAGACAT CCCTGGCACC 2160
AGTCACTCAC TGCCCTCCTT CTCCCTGGCA ACCTTCGCCT TTCAGCACTA CACTCCCTTC 2220
CTCATGCCCA GCTGGGTGAA GTTCCTCACC GTTATGCGTT TCCTGGCGGC eCTCATATCC 2280
AGCTTGTATG CCTCCACXXX5 CCTTCAGGAT GGCCTGGACA TTATTGATCT GGTGCCCAAG 2340
GACAGCAACG A6CACAAGTT CCTGGATGCT CAAACTCGCC TCTTTGGCTT CTACAGCATG 2400
TATGCGGTTA CCCAGGGCAA CTTTGAATAT CCCACCCAGC AGCAGTTGCT CAGGGACTAC 2460
CATGATTCCT TTGTGOGCGT GCCACATCTG ATCAAGAATG ATAACGGTGG ACTGCCGGAC 2520
TTCTGGCTGC TGCTCTTCAG CGAGTGGCTG CGTAATCTGC AAAAGATATT CGACGAGGAA 2580
TACCGCGACG GACGGCT6AC CAAGGAGTGC TGGTTCCCAA ACGCCAGCAG CCATGCCATC 2640
CTGGCCTACA AGCTAATCGT GCAAACCGGC CATGTGCACA ACCCCGTCGA CAAGGAACTG 2700
GTGCTCACCA ATCGCCTGCT CAACAGCGAT GGCATCATCA ACCAACGCGC CTTCTACAAC 2760
TATCTGTCGG CATGGGCCAC CAACGACGTC TTCGCCTACG GAGCTTCTCA GGGCAAATTG 2820
TATCCGGAAC CGCGCCAGTA TTTTCACCAA CCCAACGAGT ACCATCTTAA GATACCCAAG 2880
AGTCTGCCAT TGGTCTACGC TCAGATGCCC TTTTACCTCC ACGGACTAAC AGATACCTCG 2940
CAGATCAAGA CCCTGATAGG TCATATTCGC GACCTGACCG TCAAGTACGA GGGCTTCGGC 3000
CTGCCCAACT ATCCATCGGG CATTCCCTTC ATCTTCTGGG AGCAGTACAT CACCCTGCGC 3060
TCCTCACTGG CCAT6ATCCT GGCCTGCGTG CTACTCGCCG CCCTGCTGCT GGTCTCCCTG 3120
CTCCTGCTCT CCGTTTGCGC CGCCGTTCTC GTGATCCTCA CCGTTCTGGC CTCGCTCGCC 3180
CAGATCTTTG GGGCCATGAC TCTGCTGGGC ATCAAACTCT CGGCCATTCC GGCAGTCATA 3240
CTCATCCTCA GCGTGGGCAT GATGCTGTGC TTCAATCTGC TGATATCACT CCGCTTCATG 3300
ACATCCGTTG GGAACCQACA GCGCCGCGTC GAGCTGAGCA TGCAGATGTC CCTGGGACCA 3360
CTTGTCCACG 6CATGCTGAC CTCCGGAGTG GCCGTGTTCA TGCTCTCCAC CTCGCCCTTT 3420
GAGTTTGTGA TCCGGCACTT CTGCTGGCTT CTGCTGGTGG TCTTATGCGT TGGCGCCTGC 3480
AACAGCCTTT TGGTCTTCCC CaTCCTACTG AGCATGCTGG GACCGGAGGC GGAGCTGGTG 3540
CCGCTGGAGC ATCCAGACCG CATATCCUVCG CCCTCTCCGC TGCCCGTGCC CAGCAGCAAG 3600
AGATCGGGCA AATCCTATCT GCTGCaGCCA TCCCGATCCT CGCGAGGCAG CTGCCAGAAG 3660
TCGC»TCACC ACCACCACAA AGACCTTAAT OATCCATCCC TGACGACGAT CACCGAGCAG 3720
COGCAGTCGT GGAAGTCCAG CAACTCCTCC ATCCAGATGC CCAATGATTG GACCTACOIG 3780
CCGCGCGAAC AGCGACCCCC CTCCTACCCO CCCCCOCCCC CCGCCTATCA CAAGCCCOCC 3840
OCCCAGCAGC ACCACCACCA TCaiOGGCCCC CCC»CAACGC CCCCGCCTCC CTTCCCGACG 3900
GCCTATCCGC CGGAGCTCCA GACCATC6TG GTGCAGCCGG AGGTGACCGT GCAGACGACG 3960
CACTCGGACA GCAACACC»C CAAGGTCACO GCCACGGCCA ACATCAAGGT GGAOCTGCCC 4020
ATGCCCCCCA GOCCGGTGCG CAGCTAIAAC TTTACGACTT AGCACTACCA CTAGTTCCTG 4080
TAGCTATTAG GAC6TATCTT TAOACTCTAG CCTAAGCCGT AACCCTATTT GTATCTGTAA 4140
AATCGATTTG TCCAGCCGGT CTGCTGAGGA TTTCGTTCTC ATGGATTCTC ATGGATTCTC 4200
ATGGATGCTT AAATGGCATG GTAATTGGCA AAATATCAAT TTTTGTGTCT CAAAAAOATG 4260
CATTAGCTTA TGGTTTCAAG ATACATTTTT AAAGAGTCCG CCAGATATTT ATATAAAAAA 4320
AATCCAAAAT CGACGTATCC ATGAAAATTG AAAACCTAAG CAGACCCGTA TGTATGTATA 4380
TGTGTATGCA TGTTAGTTAA TTTCCCXSAAG TCCGGTATTT ATAGCAGCTG CCTT 4434

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1285 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Met Asp Arg Asp Ser Leu Pro Arg Val Pro Asp Thr His Gly Asp Val
1 5 10 15

Val Asp Glu Lys Leu Phe Ser Asp Leu Tyr Ile Arg Thr Ser Trp Val
20 25 30

Asp Ala Gln Val Ala Leu Asp Gln Ile Asp Lys Gly Lys Ala Arg Gly
40 45

Ser Arg Thr Ala Ile Tyr Leu Arg Ser Val Phe Gln Ser His Leu Glu
55 60

Thr Leu Gly Ser Ser Val Gln Lys His Ala Gly Lys Val Leu Phe Val
65

Ala Ile Leu Val Leu Ser Thr Phe Cys Val Gly Leu Lys Ser Ala Gln
85

Ile His Ser Lys Val His Gln Leu Trp Ile Gln Glu Gly Gly Arg Leu
100

Glu Ala Glu Leu Ala Tyr Thr Gln Lys Thr Ile Gly Glu Asp Glu Ser
115

Ala Thr His Gln Leu Leu Ile Gln Thr Thr His Asp Pro Asn Ala Ser
130 135

Val Leu His Pro Gln Ala Leu Leu Ala His Leu Glu Val Leu Val Lys
145 150 155

Ala Thr Ala Val Lys Val His Leu Tyr Asp Thr Glu Trp Gly Leu Arg
165

Asp Met Cys Asn Met Pro Ser Thr Pro Ser Phe Glu Gly Ile Tyr Tyr
180

Ile Glu Gln Ile Leu Arg His Leu Ile Pro Cys Ser Ile Ile Thr Pro
195

Leu Asp Cys Phe Trp Glu Gly Ser Gln Leu Leu Gly Pro Glu Ser Ala
210 215 220

Val Val Ile Pro Gly Leu Asn Gln Arg Leu Leu Trp Thr Thr Leu Asn
225 230

Pro Ala Ser Val Met Gln Tyr Met Lys Gln Lys Met Ser Glu Glu Lys
245 250

Ile Ser Phe Asp Phe Glu Thr Val Glu Gln Tyr Met Lys Arg Ala Ala
260 265

Ile Gly Ser Gly Tyr Met Glu Lys Pro Cys Leu Asn Pro Leu Asn Pro
275 280

Asn Cys Pro Asp Thr Ala Pro Asn Lys Asn Ser Thr Gln Pro Pro Asp
290 295

Val Gly Ala Ile Leu Ser Gly Gly Cys Tyr Gly Tyr Ala Ala Lys His
305 310 315

Met His Trp Pro Glu Glu Leu Ile Val Gly Gly Arg Lys Arg Asn Arg
325

Ser Gly His Leu Arg Lys Ala Gln Ala Leu Gln Ser Val Val Gln Leu
340 345

Met Thr Glu Lys Glu Met Tyr Asp Gln Trp Gln Asp Asn Tyr Lys Val
355

His His Leu Gly Trp Thr Gln Glu Lys Ala Ala Glu Val Leu Asn Ala

Trp Gln Arg Asn Phe Ser Arg Glu Val Glu Gln Leu Leu Arg Lys Gln
400

Ser Arg Ile Ala Thr Asn Tyr Asp Ile Tyr Val Phe Ser Ser Ala Ala
410

Leu Asp Asp Ile Leu Ala Lys Phe Ser His Pro Ser Ala Leu Ser Ile
420 425

Val Ile Gly Val Ala Val Thr Val Leu Tyr Ala Phe Cys Thr Leu Leu
440 445

Arg Trp Arg Asp Pro Val Arg Gly Gln Ser Ser Val Gly Val Ala Gly
450 455

Val Leu Leu Met Cys Phe Ser Thr Ala Ala Gly Leu Gly Leu Ser Ala
480

Leu Leu Gly Ile Val Phe Asn Ala Leu Thr Ala Ala Tyr Ala Glu Ser
490 495

Asn Arg Arg Glu Gln Thr Lys Leu Ile Leu Lys Asn Ala Ser Thr Gln
505 510

Val Val Pro Phe Leu Ala Leu Gly Leu Gly Val Asp His Ile Phe Ile
525

Val Gly Pro Ser Ile Leu Phe Ser Ala Cys Ser Thr Ala Gly Ser Phe
540

Phe Ala Ala Ala Phe Ile Pro Val Pro Ala Leu Lys Val Phe Cys Leu
560

Gln Ala Ala Ile Val Met Cys Ser Asn Leu Ala Ala Ala Leu Leu Val
575

Phe pro «ej n. s„ ».p ^ ^ ^ „^ 

590 

Ala Asp Ile Phe Cys Cys Cys Phe Pro Val Trp Lys Glu Gln Pro Lys
605

Val Ala Pro Pro Val Leu Pro Leu Asn Asn Asn Asn Gly Arg Gly Ala
620

Arg His Pro Lys Ser Cys Asn Asn Asn Arg Val Pro Leu Pro Ala Gln
635 640

Asn Pro Leu Leu Glu Gln Arg Ala Asp Ile Pro Gly Ser Ser His Ser
650 655

Leu Ala Ser Phe Ser Leu Ala Thr Phe Ala Phe Gln His Tyr Thr Pro
665 670

Phe Leu Met Arg Ser Trp Val Lys Phe Leu Thr Val Met Gly Phe Leu
680 685

Ala Ala Leu Ile Ser Ser Leu Tyr Ala Ser Thr Arg Leu Gln Asp Gly
690

Leu Asp Ile Ile Asp Leu Val Pro Lys Asp Ser Asn Glu His Lys Phe
705 720

Leu Asp Ala Gln Thr Arg Leu Phe Gly Phe Tyr Ser Met Tyr Ala Val
725

Thr Gln Gly Asn Phe Glu Tyr Pro Thr Gln Gln Gln Leu Leu Arg Asp
740

Tyr His Asp Ser Phe Arg Val Pro His Val Ile Lys Asn Asp Asn Gly
755

Gly Leu Pro Asp Phe Trp Leu Leu Leu Phe Ser Glu Trp Leu Gly Asn
770

u. p« 0- ni -° 

785 

Glu Cys Trp Phe Pro Asn Ala Ser Ser Asp Ala Ile Leu Ala Tyr Lys
805

Leu Ile Val Gln Thr Gly His Val Asp Asn Pro Val Asp Lys Glu Leu
820

v.. L.U ™. »n v.. »« S.. «P o.. X- - - "» 

835 

Ala Phe Tyr Asn Tyr Leu Ser Ala Trp Ala Thr Asn Asp Val Phe Ala
850

01, M. s-r Oln «, L-« Tyr .ro clu P« «, ol„ Tyr PJ. 



865 

H,. «n pro »n «o Tyr ^ ^ »J '« ^ ^ 

885 

v.. Tyr H« ^' "5 

900 ^ 

Ue .y. T» XX. C.y HU XX. »P - - v.X .y. Tyr 

ox„ cxy "y x^ pro «n Tyr Pro s.r OXy XX. Pro P>» XXe P.. 
930 

,rp 0X» <=x» Tyr H.t T« «u s.r «r «. Hot XXe «. 

ft>ic 950 ^""^ 

945 

Cy. V.X X^ X-U M. - "° «S 

^ 965 

V.X Trp AX. »X. V.X V.X XX. ^- ser V.X X^ - 
980

995 1000 1005

Pro Ala Val Ile Leu Ile Leu Ser Val Gly Met Met Leu Cys Phe Asn
1010 1015 1020

Val Leu Ile Ser Leu Gly Phe Met Thr Ser Val Gly Asn Arg Gln Arg
1025 1030 1035 1040

Arg Val Gln Leu Ser Met Gln Met Ser Leu Gly Pro Leu Val His Gly
1045 1050 1055

Met Leu Thr Ser Gly Val Ala Val Phe Met Leu Ser Thr Ser Pro Phe
1060 1065 1070

Glu Phe Val Ile Arg His Phe Cys Trp Leu Leu Leu Val Val Leu Cys
1075 1080 1085

Val Gly Ala Cys Asn Ser Leu Leu Val Phe Pro Ile Leu Leu Ser Met
1090 1095 1100

Val Gly Pro Glu Ala Glu Leu Val Pro Leu Glu His Pro Asp Arg Ile
1105 1110 1115 1120

Ser Thr Pro Ser Pro Leu Pro Val Arg Ser Ser Lys Arg Ser Gly Lys
1125 1130 1135

Ser Tyr Val Val Gln Gly Ser Arg Ser Ser Arg Gly Ser Cys Gln Lys
1140 1145 1150

Ser His His His His His Lys Asp Leu Asn Asp Pro Ser Leu Thr Thr
1155 1160 1165

Ile Thr Glu Glu Pro Gln Ser Trp Lys Ser Ser Asn Ser Ser Ile Gln
1170 1175 1180

Met Pro Asn Asp Trp Thr Tyr Gln Pro Arg Glu Gln Arg Pro Ala Ser
1185 1190 1195 1200

Tyr Ala Ala Pro Pro Pro Ala Tyr His Lys Ala Ala Ala Gln Gln His
1205 1210 1215

His Gln His Gln Gly Pro Pro Thr Thr Pro Pro Pro Pro Phe Pro Thr
1220 1225 1230

Ala Tyr Pro Pro Glu Leu Gln Ser Ile Val Val Gln Pro Glu Val Thr
1235 1240 1245

Val Glu Thr Thr His Ser Asp Ser Asn Thr Thr Lys Val Thr Ala Thr
1250 1255 1260

Ala Asn Ile Lys Val Glu Leu Ala Met Pro Gly Arg Ala Val Arg Ser
1265 1270 1275 1280

Tyr Asn Phe Thr Ser
1285

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 345 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
AAGCTCCATC AGCTTTGGAT ACAGGAAGGT GGTTCGCTCG AGCATGAGCT AGCCTACACG 60
CAGAAATCGC TCGGCGAGAT GGACTCCTCC ACCCACCAGC TGCTAATCCA AACNCCCAAA 120
GATATGGACO CCTCGATACT GCACCCGAAC GCGCTACTGA CGCACCTGGA CGTGGTGAAG 180
AAAGCGATCT CGGTGACCGT GCACATGTAC GACATCACGT GGAGNCTCAA GGACATGTGC 240
TACTCGCCCA GCATACCGAG NTTCGATACG CACTTTATCG AGCAGATCTT CGAGAACATC 300
ATACCGTGCG CGATCATCAC GCCGCTGGAT TGCTTTTGGG AGGGA 345

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 115 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Lys Val His Gln Leu Trp Ile Gln Glu Gly Gly Ser Leu Glu His Glu
1 5

Leu Ala Tyr Thr Gln Lys Ser Leu Gly Glu Met Asp Ser Ser Thr His
20 25 30

Gln Leu Leu Ile Gln Thr Pro Lys Asp Met Asp Ala Ser Ile Leu His
35 40 45

Pro Asn Ala Leu Leu Thr His Leu Asp Val Val Lys Lys Ala Ile Ser
50 55 60

Val Thr Val His Met Tyr Asp Ile Thr Trp Xaa Leu Lys Asp Met Cys

Tyr Ser Pro Ser Ile Pro Xaa Phe Asp Thr His Phe Ile Glu Gln Ile
85 90

Phe Glu Asn Ile Ile Pro Cys Ala Ile Ile Thr Pro Leu Asp Cys Phe
100 105

Trp Glu Gly
115

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5187 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
GGGTCTGTCA CCCCGACCCG CAGTCCCCGG CCGCCAGCAG CGTCCTCCCG AGCCGAGC6C 60
CCAGGCGCGC CCGGAGCCCG CGCCCGCGGC CGCAACATGG CCTCGGCTGG TAACGCCGCC 120
GCGCCCCTGG GCAGGCAGGC CGGOGGCGGG AGGCGCAGAC GGACCGGGGG ACCGCACCGC 180
GCCGCGCCGC ACCGGGACTA TCTGCACCGG CCCAGCTACT GCGACGCCGC CTTCGCTCTG 240
GAGCAGATTT CCAAGGCGAA CGCTACTGGC CGGAAAGCGC CGCTGTGGCT GAGAGCGAAG 300
TTTCAGAGAC TCTTATTTAA ACTGGGTTGT TACATTCAAA AGAACTGOGG CAAGTTTTTG 360
GTTCTGGCTC TCCTCATATT TCCGGCCTTC GCTGTGGCAT TAAAGGCACC TAATCTCGAG 420
ACCAACCTGG AGGAGCTCTC 6GTGGAAGTT CGTGGACGAG TGAGTCGACA ATTAAATTAT 480
ACCCGTCAGA AGATAGCAGA AGAGCCTATG TTTAATCCTC AACTCATGAT ACAGACTCCA 540
AAAGAACAAG GCGCTAATGT TCTCACCACA GAGGCTCTCC TGCAACACCT GGACTCAGCA 600
CTCCAGGCCA GTCGTGTCCA CGTCTACATG TATAACAGGC AATGGAAGTT GGAACATTTG 660
TGCTACAAAT CACGGGAACT TATCACGGAG ACAGGTTACA TGGATCAGAT AATAGAATAC 720
CTTTACCCTT GCTTAATCAT TACACCTTTG GACTGCTTCT GGGAAGGGGC AAAGCTACAG 780
TCCGGGACAG CATACCTCCT AGGTAAGCCT CCTTTACGGT GGACAAACTT TGACCCCTTG 840
GAATTCCTAG AACAGTTAAA GAAAATAAAC TACCAAGTGG ACAGCTGGCA CGAAATGCTG 900
AATAAAGCCG AAGTTGCCCA TGGGTACATG CACCGCCCTT GCCTCAACCC ACCCGACCCA 960
GATTGCCCTG CCACAGCCCC TAACAAAAAT TCAACCAAAC CTCTTGATGT GGCCCTTGTT 1020
TTGAATCCTC CATCTCAAGG TTTATCCAGC AAGTATATGC ATTGCCACCA GCAGTTGATT 1080
GTGGGTGCTA CCCTCAAGAA TCCCACTGGA AAACTTGTCA GCCCTCACGC CCTCCAAACC 1140
ATGTTCCAGT TAATCACTCC CAAGCAAATC TATGAACACT TCACGGGCTA CGACTATGTC 1200
TCTCACATCA ACTGGAATGA AGACAGCGCA CCCGCCATCC TGGAGGCCTG GCAGAGGACT 1260
TACGTGGAGG TGCTTCATCA AAGTGTCGCC CCAAACTCCA CTCAAAAGGT GCTTCCCTTC 1320
ACAACCACGA CCCTGGACGA CATCCTARAA TCCTTCTCTG ATGTCAGTGT CATCCGAGTG 1380
CCCAGCGGCT ACCTACTGAT GCTTGCCTAI CCCTGTTTAA CCATGCTCCG CTGGCACTGC 1440
TCCAAGTCCC AGGGTGCCGT GGCGCTCGCT GGCGTCCTGT TGGTTGCGCT GTCACTGGCT 1500
GCAGGATTGG GCCTCTGCTC CTTGATTGGC ATTTCTTTTA ATGCTGCGAC AACTCAGGTT 1560
TTGCCGTTTC TTGCTCTTGG TGTTGGTGTG CATGATGTCT TCCTCCTGGC CCATGCATTC 1620
AGTGAAACAG GACAGAATAA GAGGATTCCA TTTGAGGACA GGACTGGGGA GTGCCTCAAG 1680
CGCACCGGAG CCAGCGTGGC CCTCACCTCC ATCAGCAATG TCACCGCCTT CTTCATGGCC 1740
GCATTGATCC CTATCCCTGC CCTGCGAGCC TTCTCCCTCC AGGCTGCTGT GGTGGTGGTA 1800
TTCAATTTTG CTATGGTTCT GCTCATTTTT CCTGCAATTC TCAGCATGGA TTTATACAGA 1860
CGTGAGGACA GAAGATTGGA TATTTTCTGC TGTTTCACAA GCCCCTGTGT CAGCAGGGTG 1920
ATTCAAGTTG AGCCACAGGC CTACACAGAG CCTCACAGTA ACACCCGGTA CAGCCCCCCA 1980
CCCCCATACA CCAGCCACAG CTTCGCCCAC GAAACCCATA TCACTATGCA GTCCACCGTT 2040
CAGCTCCGCA CAGAGTATGA CCCTCACACG CACGTGTACT ACACCACCGC CGAGCCACGC 2100
TCTGAGATCT CTGTACACCC TGTTACCGTC ACCCAGGACA ACCTCAGCTG TCAGAGTCCC 2160
GAGAGCACCA GCTCTACCAG GGACCTGCTC TCCCAGTTCT CAGACTCCAG CCTCCACTGC 2220
CTCGAGCCCC CCTGCACCAA GT6GACACTC TCTTCGTTTG CAGAGAACCA CTATGCTCCT 2280
TTCCTCCTGA AACCCAAACC CAAGGTTGTG GTAATCCTTC TTTTCCTGGG CTTGCTGGGG 2340
GTCAGCCTTT ATGGGACCAC CCGAGTGAGA GACGGGCTGG ACCTCACGGA CATTGTTCCC 2400
CGGGAAACCA GAGAATATGA CTTCATAGCT GCCCAGTTCA AGTACTTCTC TTTCTACAAC 2460
ATGTATATAG TCACCCAGAA AGCAGACTAC CCGAATATCC AGCACCTACT TTACGACCTT 2520
CATAAGACTT TCAGCAATGT GAAGTATGTC ATGCTGGAGG AGAACAAGCA ACTTCCCCAA 2580
ATGTGGCTGC ACTACTTTAG AGACTGGCTT CAAGGACTTC AGGATGCATT TGACAGTGAC 2640
TGCGAAACTG GGAGGATCAT GCCAAACAAT TATAAAAATG GATCAGATGA CGGGGTCCTC 2700
GCTTACAAAC TCCTGGTGCA GACTGGCAGC OGAGACAAGC CCATCGACAT TAGTCAGTTG 2760
ACTAAACAGC GTCTGGTAGA CGCA6ATGGC ATCATTAATC CGAGCGCTTT CTACATCTAC 2820
CTGACCGCTT GGGTCAGCAA CGACCXrTGTA GCTTACGCTG CCTCCCAGGC CAACATCCGG 2880
carrcAccocc cxwaotccct ccatoacaaa gccoactac» tcccaoagac caggctgaca 2940

ATCCCAGCAG CAOAGCCCAT CGA6TACGCT CAGTTCCCTT TCTACCTCAA CGCCCTACGA 3000 

GACACCTCAG ACTTTGTGGA AGCCATACAA AAAGTGAGAG TCATCTGTAA CAACTATACG 3060 

AGCCTGGGAC TCTCCAGCTA CCCCAATGGC TACCCCTTCC TGTTCTGGCA GCAATACATC 3120 

AGCCTGCGCC ACTGGCTCCT GCTATCCATC AOOCTOGTGC TGGCCTOCaVC GTTTCTACTG 3180 

TOCGCAGTCT TCCTCCTGAA CCCCTGOACG GCOOGCATCA TTGTCATOGT CCTGCCTCTC 3240 

ATGACCGTTO AGCTCTTTGC CATGAT0G6C CTCATTCGGA TCAA0CT6A6 TGCTOTGCCT 3300 

6TGGTCATCC TGATTGCATC TCTTOGCATC CGAGTGGAGT TCACCGTCCA CGTGGCTTTG 3360 

GCCTTTCTGA CAGCCATTGG GGACAACAAC C»CaGGCCTA TGCTCGCTCT GGAACACATG 3420 

TTTGCTCCCG TTCTGGACGG TGCTGTGTCC ACTCTGCTCC GT6TACTGAT GCTTGCAGGC 3480 

TCCGAATTTG ATTTCATTGT CAGATACTTC TTTGCCGTCC TGGCCATTCT CACCGTCTTG 3540 

GGGGTTCTCA ATGGACTGGT TCTGCTGCCT GTCCTCTTAT CCTTCTTTOC ACCGTGTCCT 3600 

GAGGTGTCTC CAGCCAATGO CCTAAACCGA CTGCCCACTC CTTCGCCTGA GCCGCCTCCA 3660 

AGTCTCGTCC GCTTTGCCOT GCCTCCTGGT CACACGAACA ATGGGTCTGA TTCCTCCGAC 3720 

TCGGAGTACA GCTCTCAGAC CACGCTGTCT GGCATCAGTG AGGAGCTCAG GCAATACCAA 3780 

GCACAGCAGG GTCCCGGAGG CCCTGCCCAC CAAGTGATTG TGGAAGCCAC AGAAAACCCT 3840 

GTCTTTGCCC GCTCCACTCT GGTCCAICCX; GACTCCAGAC ATCAGCCTCC CTTGACCCCT 3900 

CCCCAACAGC CCCACCTGCA CTCTGGCTCC TTGTCCCCTG GACGGCAACG CCAGCACCCT 3960 

CGAAGGGATC CCCCTAGAGA AGGCTTOOGC CCACCCCCCT ACACACCCCC CACAGAOGCT 4020 

TTTGAAATTT CTACTGAAGG CCATTCTGCC CCTAGCAATA GGGACCGCTC AGGGCCCCGT 4080 

GGGGCCCGTT CTCaCAACCC TCGGAACCCA ACGTCCACCC CCATGGGCAG CTCTGTGCCC 4140 

AGCTACrrGCC AGCCCATCAC CACTGT6ACG CCTTCTGCTT CGGTGACTGT TCCT6TGCAT 4200 

CCCCC6CCTC GACCTGOGCG CAACCCC06A OOOGGCCCCT GTCCAGCCTA TGACAGCTAC 4260 

CCTGAGACTO ATCAC6GG6T ATTT6AGCAT CCTCATOTGC CTTTTCArGT CAOGTGTGAG 4320 

AGGAGGGACT CftAAGCTGGA OGTCaiAGAG CTACAGGACG TGGAATGTGA 6GAGAGGCCG 4380 

TGGGGGAGCA CCTCCAACTG AGGGTAATTA AAATCTGAAG CAAAGAGGCC AAAGATTCCA 4440 

AA6CCCCGCC CCCACCTCTT TCCAGAACTC CTTGAAGAGA ACTGCTTGCA ATTATGGGAA 4500 

GCCACTTCAT TGTTACTOTA ACTGATTCTA TTATTKKGTO AAATATTTCT ATAAATATTT 4560 

AARAGGTCTA CACATCTAAT ATACATOOAA ATGCTGTACA GTCTATTTCC TCGCGCCTCT 4620
CCACTCCTi»w CCCAGAGTGG CGAGACCACA GGGCCCCTTT CCCCTCTCTA CATTGGTCTC 4680
TGTGCCAWirt CC»AOCTTAA CTTAOTTTTA AAAAAAATCT CCCAGCATAT CTCGCTGCTG 4740
otth a A.T ATT GTATAATTTA CTTGTATAAT TCTATCCAAA TATTGCTTAT GTAATAGGAT 4800
TATTTGTAAft GGTTTCTGTT TAAAATATTT TAAATTTGCA TATCACAACC CTGTGGTAGG 4860
ATGAATTGTT ACTGTTAACT TTTGAACACC CTATGCGTGG TAATTGTTTA ACGAGCAGAC 4920
ATGAAGAAAA CAGGTTAATC CCAGTCGCTT CTCTAGGGGT AGTTGTATAT GGTTCGCATG 4980
GGTGGATGTG TGTGTGCATG TGACTTTCCA ATGTACTGTA TTGTGGTTTG TTGTTGTTGT 5040
TGCTGTTGTT GTTCATTTTG GTGTTTTTGC TTCCTTTGTA TGATCTTAGC TCTGGCCTAG 5100
GTGGGCTGGG AAGGTCCAGG TCrTTTTCTG TCGTGATGCT GGTGGAAAGG TGACCCCAAT 5160
CATCTGTCCT ATTCTCTGGG ACTATTC 5187







(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1434 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Met Ala Ser Ala Gly Asn Ala Ala Gly Ala Leu Gly Arg Gln Ala Gly

Gly Gly Arg Arg Arg Arg Thr Gly Gly Pro His Arg Ala Ala Pro Asp
20

Arg Asp Tyr Leu His Arg Pro Ser Tyr Cys Asp Ala Ala Phe Ala Leu
35

Glu Gln Ile Ser Lys Gly Lys Ala Thr Gly Arg Lys Ala Pro Leu Trp
50 55

Leu Arg Ala Lys Phe Gln Arg Leu Leu Phe Lys Gly Cys Tyr Ile
65

Gln Lys Asn Cys Gly Lys Phe Leu Val Val Gly Leu Leu Ile Phe Gly
85

Ala Phe Ala Val Gly Leu Lys Ala Ala Asn Leu Glu Thr Asn Val Glu
100

Glu Leu Trp Val Glu Val Gly Gly Arg Val Ser Arg Glu Leu Asn Tyr
115 120

Thr Arg Gln Lys Ile Gly Glu Glu Ala Met Phe Asn Pro Gln Leu Met
130 135 140

Ile Gln Thr Pro Lys Glu Glu Gly Ala Asn Val Leu Thr Thr Glu Ala
145 150 155

Leu Leu Gln His Leu Asp Ser Ala Leu Gln Ala Ser Arg Val His Val
165 170 175

Tyr Met Tyr Asn Arg Gln Trp Lys Leu Glu His Leu Cys Tyr Lys Ser
180 185 190

Gly Glu Leu Ile Thr Glu Thr Gly Tyr Met Asp Gln Ile Ile Glu Tyr
195 200 205

Leu Tyr Pro Cys Leu Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly
210 215 220

Ala Lys Leu Gln Ser Gly Thr Ala Tyr Leu Leu Gly Lys Pro Pro Leu
225 230 235 240

Arg Trp Thr Asn Phe Asp Pro Leu Glu Phe Leu Glu Glu Leu Lys Lys
245 250 255

Ile Asn Tyr Gln Val Asp Ser Trp Glu Glu Met Leu Asn Lys Ala Glu
260 265 270

Val Gly His Gly Tyr Met Asp Arg Pro Cys Leu Asn Pro Ala Asp Pro
275 280 285

Asp Cys Pro Ala Thr Ala Pro Asn Lys Asn Ser Thr Lys Pro Leu Asp
290 295 300

Val Ala Leu Val Leu Asn Gly Gly Cys Gln Gly Leu Ser Arg Lys Tyr
305 310 315 320

Met His Trp Gln Glu Glu Leu Ile Val Gly Gly Thr Val Lys Asn Ala
325 330 335

Thr Gly Lys Leu Val Ser Ala His Ala Leu Gln Thr Met Phe Gln Leu
340 345 350

Met Thr Pro Lys Gln Met Tyr Glu His Phe Arg Gly Tyr Asp Tyr Val
355 360 365

Ser His Ile Asn Trp Asn Glu Asp Arg Ala Ala Ala Ile Leu Glu Ala
370 375 380

Trp Gln Arg Thr Tyr Val Glu Val Val His Gln Ser Val Ala Pro Asn
385 390 395 400

Ser Thr Gln Lys Val Leu Pro Phe Thr Thr Thr Thr Leu Asp Asp Ile
405 410 415

Leu Lys Ser Phe Ser Asp Val Ser Val Ile Arg Val Ala Ser Gly Tyr
420 425 430

Leu Leu Met Leu Ala Tyr Ala Cys Leu Thr Met Leu Arg Trp Asp Cys
435 440 445

Ser Lys Ser Gln Gly Ala Val Gly Leu Ala Gly Val Leu Leu Val Ala
450 455 460

Leu Ser Val Ala Ala Gly Leu Gly Leu Cys Ser Leu Ile Gly Ile Ser
465 470 475 480

Phe Asn Ala Ala Thr Thr Gln Val Leu Pro Phe Leu Ala Leu Gly Val
485 490 495

Gly Val Asp Asp Val Phe Leu Leu Ala His Ala Phe Ser Glu Thr Gly
500 505 510

Gln Asn Lys Arg Ile Pro Phe Glu Asp Arg Thr Gly Glu Cys Leu Lys
515 520 525

Arg Thr Gly Ala Ser Val Ala Leu Thr Ser Ile Ser Asn Val Thr Ala
530 535 540

Phe Phe Met Ala Ala Leu Ile Pro Ile Pro Ala Leu Arg Ala Phe Ser
545 550 555 560

Leu Gln Ala Ala Val Val Val Val Phe Asn Phe Ala Met Val Leu Leu
565 570 575

Ile Phe Pro Ala Ile Leu Ser Met Asp Leu Tyr Arg Arg Glu Asp Arg
580 585 590

Arg Leu Asp Ile Phe Cys Cys Phe Thr Ser Pro Cys Val Ser Arg Val
595 600 605

Ile Gln Val Glu Pro Gln Ala Tyr Thr Glu Pro His Ser Asn Thr Arg
610 615 620

Tyr Ser Pro Pro Pro Pro Tyr Thr Ser His Ser Phe Ala His Glu Thr
625 630 635 640

His Ile Thr Met Gln Ser Thr Val Gln Leu Arg Thr Glu Tyr Asp Pro
645 650 655

His Thr His Val Tyr Tyr Thr Thr Ala Glu Pro Arg Ser Glu Ile Ser
660 665 670

Val Gln Pro Val Thr Val Thr Gln Asp Asn Leu Ser Cys Gln Ser Pro
675 680 685

Glu Ser Thr Ser Ser Thr Arg Asp Leu Leu Ser Gln Phe Ser Asp Ser
690 695 700

Ser Leu His Cys Leu Glu Pro Pro Cys Thr Lys Trp Thr Leu Ser Ser
705 710 715 720

Phe Ala Glu Lys His Tyr Ala Pro Phe Leu Leu Lys Pro Lys Ala Lys
725 730 735

Val Val Val Ile Leu Leu Phe Leu Gly Leu Leu Gly Val Ser Leu Tyr
740 745 750

Gly Thr Thr Arg Val Arg Asp Gly Leu Asp Leu Thr Asp Ile Val Pro
755 760

Arg Glu Thr Arg Glu Tyr Asp Phe Ile Ala Ala Gln Phe Lys Tyr Phe
770 775 780

Ser Phe Tyr Asn Met Tyr Ile Val Thr Gln Lys Ala Asp Tyr Pro Asn
785 790 800

Ile Gln His Leu Leu Tyr Asp Leu His Lys Ser Phe Ser Asn Val Lys
805 810 815

Tyr Val Met Leu Glu Glu Asn Lys Gln Leu Pro Gln Met Trp Leu His
820 825 830

Tyr Phe Arg Asp Trp Leu Gln Gly Leu Gln Asp Ala Phe Asp Ser Asp
835 840 845

Trp Glu Thr Gly Arg Ile Met Pro Asn Asn Tyr Lys Asn Gly Ser Asp
850 855 860

Asp Gly Val Leu Ala Tyr Lys Leu Leu Val Gln Thr Gly Ser Arg Asp
865 870 875 880

Lys Pro Ile Asp Ile Ser Gln Leu Thr Lys Gln Arg Leu Val Asp Ala
885 890 895

Asp Gly Ile Ile Asn Pro Ser Ala Phe Tyr Ile Tyr Leu Thr Ala Trp
900 905

Val Ser Asn Asp Pro Val Ala Tyr Ala Ala Ser Gln Ala Asn Ile Arg
915 920 925

Pro His Arg Pro Glu Trp Val His Asp Lys Ala Asp Tyr Met Pro Glu
930 935

Thr Arg Leu Arg Ile Pro Ala Ala Glu Pro Ile Glu Tyr Ala Gln Phe
950 955 960

Pro Phe Tyr Leu Asn Gly Leu Arg Asp Thr Ser Asp Phe Val Glu Ala
965 970

Ile Glu Lys Val Arg Val Ile Cys Asn Asn Tyr Thr Ser Leu Gly Leu
980 985 990

Ser Ser Tyr Pro Asn Gly Tyr Pro Phe Leu Phe Trp Glu Gln Tyr Ile
995 1000 1005

^tJ^^ 8er val Val Leu Ala Cys 

1010 1015 1020 

Thr Phe Leu Val Cys Ala Val Phe Leu Leu Asn Pro Trp Thr Ala Gly
1025 1030 1035 1040

Ile Ile Val Met Val Leu Ala Leu Met Thr Val Glu Leu Phe Gly Met
1045 1050 1055

Met Gly Leu Ile Gly Ile Lys Leu Ser Ala Val Pro Val Val Ile Leu
1060 1065 1070

Ile Ala Ser Val Gly Ile Gly Val Glu Phe Thr Val His Val Ala Leu
1075 1080 1085

Ala Phe Leu Thr Ala Ile Gly Asp Lys Asn His Arg Ala Met Leu Ala
1090 1095 1100

Leu Glu His Met Phe Ala Pro Val Leu Asp Gly Ala Val Ser Thr Leu
1105 1110 1115 1120

Leu Gly Val Leu Met Leu Ala Gly Ser Glu Phe Asp Phe Ile Val Arg
1125 1130 1135

Tyr Phe Phe Ala Val Leu Ala Ile Leu Thr Val Leu Gly Val Leu Asn
1140 1145 1150



Gly Leu Val Leu Leu Pro Val Leu Leu Ser Phe Phe Gly Pro Cys Pro
1155 1160 1165

Glu Val Ser Pro Ala Asn Gly Leu Asn Arg Leu Pro Thr Pro Ser Pro
1170 1175 1180

Glu Pro Pro Pro Ser Val Val Arg Phe Ala Val Pro Pro Gly His Thr
1185 1190 1195 1200

Asn Asn Gly Ser Asp Ser Ser Asp Ser Glu Tyr Ser Ser Gln Thr Thr
1205 1210 1215

Val Ser Gly Ile Ser Glu Glu Leu Arg Gln Tyr Glu Ala Gln Gln Gly
1220 1225 1230

Ala Gly Gly Pro Ala His Gln Val Ile Val Glu Ala Thr Glu Asn Pro
1235 1240 1245

Val Phe Ala Arg Ser Thr Val Val His Pro Asp Ser Arg His Gln Pro
1250 1255 1260

Pro Leu Thr Pro Arg Gln Gln Pro His Leu Asp Ser Gly Ser Leu Ser
1265 1270 1275 1280

Pro Gly Arg Gln Gly Gln Gln Pro Arg Arg Asp Pro Pro Arg Glu Gly
1285 1290 1295

Leu Arg Pro Pro Pro Tyr Arg Pro Arg Arg Asp Ala Phe Glu Ile Ser
1300 1305 1310

Thr Glu Gly His Ser Gly Pro Ser Asn Arg Asp Arg Ser Gly Pro Arg
1315 1320 1325

Gly Ala Arg Ser His Asn Pro Arg Asn Pro Thr Ser Thr Ala Met Gly
1330 1335 1340

Ser Ser Val Pro Ser Tyr Cys Gln Pro Ile Thr Thr Val Thr Ala Ser
1345 1350 1355 1360

Ala Ser Val Thr Val Ala Val His Pro Pro Pro Gly Pro Gly Arg Asn
1365 1370 1375

Pro Arg Gly Gly Pro Cys Pro Gly Tyr Glu Ser Tyr Pro Glu Thr Asp
1380 1385 1390

His Gly Val Phe Glu Asp Pro His Val Pro Phe His Val Arg Cys Glu
1395 1400 1405

u?o^°^ ^^"^ ""^^ aiis''*^ ""^^ '^^P 



1420 



Glu Glu Arg Pro Trp Gly Ser Ser Ser Asn 
1425 1430 

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Leu Ile Val Gly Gly
1 5 

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Pro Phe Phe Trp Glu Gln Tyr
1 5 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "primer"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
GGACGAATTC AARGTNCAYC ARYTNTGG

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GGACGAATTC CYTCCCARAA RCANTC

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "primer"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GGACGAATTC YTKCANTGYT TCTCGGA

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "primer"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CATACCAGCC AAGCTTGTCN GGCCARTGCA T 
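
The primers of SEQ ID NOS: 14 to 17 are written with IUPAC ambiguity codes (R, Y, K, N), so each listed sequence stands for a pool of related oligonucleotides rather than a single molecule. As an illustrative sketch only, and not part of the original filing, the following Python snippet checks each primer's length against the stated LENGTH field and counts how many distinct sequences the degenerate primer represents. The names `degeneracy` and `summarize` are hypothetical helpers introduced here, and the lowercase "y" printed in SEQ ID NO: 16 is assumed to be the ambiguity code Y.

```python
from math import prod

# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "CG", "W": "AT", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as printed in the listing (spaces are layout only).
PRIMERS = {
    14: "GGACGAATTC AARGTNCAYC ARYTNTGG",
    15: "GGACGAATTC CYTCCCARAA RCANTC",
    16: "GGACGAATTC YTKCANTGYT TCTCGGA",   # assumption: printed "y" means Y
    17: "CATACCAGCC AAGCTTGTCN GGCCARTGCA T",
}


def degeneracy(primer: str) -> int:
    """Number of distinct plain sequences covered by a degenerate primer."""
    seq = primer.replace(" ", "").upper()
    return prod(len(IUPAC[base]) for base in seq)


def summarize(primers: dict) -> dict:
    """Map each SEQ ID number to (length in bases, degeneracy)."""
    return {
        seq_id: (len(seq.replace(" ", "")), degeneracy(seq))
        for seq_id, seq in primers.items()
    }


for seq_id, (length, count) in summarize(PRIMERS).items():
    print(f"SEQ ID NO: {seq_id}: {length} bases, {count} sequence(s) in the pool")
```

Under these assumptions the computed lengths (28, 26, 27 and 31 bases) agree with the LENGTH fields given for SEQ ID NOS: 14 to 17.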
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5288 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

GAATTCCGCG GACCGCAAGC AGTGCCGCGG AAGCGCCCCA AGGACAGGCT CGCTCGGCGC 60 

GCCGGCTCTC GCTCTTCCGC GAACTGGATG TGGGCAGCGC CGGCCGCAGA GACCTCGGGA 120 

CCCCCGCGCA ATGTGGCAAT GGAAGGCGCA GGGTCTGACT CCCCGGCAGC GGCCGCGGCC 180 

GCAGCGGCAG CAGCGCCCGC OGTGTGAGCA CCAGCAGCGG CTGGTCTGTC AACCGGAGCC 240 

CGAGCCCGAG CAGCCTGCGG CCAGCAGCGT CCTCGCAAGC C6AGCGCCCA CGOGCGCCAG 300 

GAGCCCGCAG CACCGGCACC AGCGOGCCGG GCCGCCCGGG AACCCTCCGT CCCCGCGGCG 360 

GCGCCGGCGG CGGCGGCGGC AACATGCCCT CCGCTGGTAA CGCCGCCGAG CCCCAGGACC 420 

GCGGCGGCGG CGGCAGCGGC TGTATCGGTG CCCCGGGACG GCCGGCTGGA GGCGGGAGGC 480 

GCAGACGGAC GGGGGG6CTG 06CCGTGCTG CCCCGCCGGA CCGGGACTAT CTGCACCGGC 540 

CCAGCTACTC CGACGCCGCC TTCGCTCTGG AGCAGATTTC CAAGGGGAAG GCTACTCCCC 600 

GGAAAGCGCC ACTGTGGCTG AGACOGAAGT TTCAGAGACT CTTATTTAAA CTGGGTTGTT 660 

ACATTCAAAA AAACTGCGGC AAGTTCTTGG TTGTGGGCCT CCTCATATTT GGGGCCTTCG 720

CGGTGGGATT AAAAGCAGCG AACCTCGAGA CCAACGTCGA CGAGCTCTGG GTGGAAGTTG 780

GAGGACGAGT AAGTCGTGAA TTAAATTATA CTCGCCAGAA GATTGGAGAA GAGGCTATGT 840 

TTAATCCTCA ACTCATGATA CAGACCCCTA AAGAAGAAGG TGCTAATGTC CTGACCACAG 900 

AAGCGCTCCT ACAACACCTC GACTCGGCAC TCCAGGCCAG CCGTCTCCAT GTATACATGT 960 

ACAACACGCA GTGGAAATTG GAACATTTGT GTTACAAATC AGGACAGCTT ATCACAGAAA 1020 

CAGGTTACAT GGATCAGATA ATAGAATATC TTTACCCTTG TTTGATTATT ACACCTTTCG 1080 

ACTGCTTCTG GGAAGGGGCG AAATTACAGT CTGGGACAGC ATACCTCCTA GGTAAACCTC 1140 

CTTTGCGGTG GACAAACTTC GACCCTTTGG AATTCCTGGA AGAGTTAAAG AAAATAAACT 1200 

ATCAAGTGGA CAGCTGGGAG GAAATGCTGA ATAAGGCTGA GGTTGGTCAT GGTTACATGG 1260 

ACCGCCCCTG CCTCAATCCG GCCGATCCAG ACTGCCCCGC CACACCCCCC AACAAAAATT 1320 

CAACCAAACC TCTTGATATG GCCCTrGTTT TGAATCGTGC ATGTCATGGC TTATCCAGAA 1380 

AGTATATGCA CTGGCAGGAG GAGTTGATTG TGGGTGGCAC AGTCAAGAAC AGCACTGGAA 1440 

AACTCGTCAG CGCCCATGCC CTGCAGACCA TGTTCCAGTT AATGACTCCC AAGCAAATGT 1500 

ACGAGCACTT CAAGGGGTAC GAGTATGTCT CACACATCAA CTGGAACGAG GACAAAGCGG 1560 

CAGCCATCCT GGAGGCCTGG CAGAGGACAT ATGTGGAGGT GGTTCATCAG AGTCTCGCAC 1620 

AGAACTCCAC TCAAAAGGTG CTTTCCTTCA CCACCACGAC CCTGGACGAC ATCCTGAAAT 1680 

CCTTCTCTGA CGTCAGTGTC ATCCGCGTGG CCAGCGGCTA CTTACTCATG CTCGCCTATG 1740 

CCTGTCTAAC CATGCTGCGC TGGGACTGCT CCAAGTCCCA GGGTGCCGTG GGGCTGGCTG 1800 

GCGTCCTGCT GGTTGCACTG TCAGTGCCTG CAGGACTGGG CCTGTGCTCA TTGATCGCAA 1860 

TTTCCTTTAA CGCTGCAACaV ACTCAGGTTT TGCCATTTCT CGCTCTTCGT GTTGGTGTGG 1920 

ATGATGTTTT TCTTCTGGCC CACGCCTTCA GTGAAACAGG ACAGAATAAA AGAATCCCTT 1980 

TTGACGACAG GACCGGGGAG TCCCTGAAGC GCACAGGAGC CAGCGTGGCC CTCACGTCCA 2040 

TCAGCAATGT CACAGCCTTC TTCATCCCCC CGTTAATCCC AATTCCCGCT CTGCGGGCGT 2100 

TCTCCCTCCA GGCAGCGGTA GTAGTCGTGT TCAATTTTGC CATGGTTCTG CTCATTTTTC 2160 

CTGCAATTCT CAGCATGGAT TTATATCGAC GCGAGGACAG GAGACTGGAT ATTTTCTGCT 2220 

GTTTTACAAG CCCCTGCGIC AGCAGAGTGA TrCAGGTTGA ACCTCAGGCC TACACCGACA 2280 
CACACGACAA TACCOGCTAC A6CCCCCCAC CTCCCTACAG CAGCCACAGC TTTGCCCATG 2340

AAACGCAGAT TACCaTGCAG TCCACTGTCC AGCTCCGCaVC GCAGTACGAC CCCCACACGC 2400

ACGTGTACTA CACCACCGCT GAGCC6CGCT CCGAGATCTC TGTGCAGCCC GTCACCGTGA 2460

CACAGGACAC CCTCAGCTGC CAGA6CCCAG AGAGCACCAG CTCCACAAGG GACCTCCTCT 2520

CCCAGTTCTC CGACTCGAGC CTCCACTGCC TOGAGCCCCC CTCTACGAAG TGGACACTCT 2580 

CATCTTTTGC TGAGAAGCAC TATGCTCCTT TCCTCTTGAA ACCAAAAGCC AAGGTAGTGG 2640 

TGATCTTCCT TTTTCTGGGC TTGCTOGCOG TCACCCTTTA TGGCACCACC CCAGTCAGAG 2700 

ACGGGCTGGA CCTTACGGAC ATTGTACCTC CGGAAACCAG AGAATATGAC TTTATTGCTG 2760 

CACAATTCAA ATACTTTTCT TTCTACAACA TOTATATACT CACCCAGAAA GCAGACTACC 2820 

CGAATATCCA GCACTTACTT TACGACCTAC ACAGGAGTTT CAGTAACGTG AAGTATGTCA 2880 

TGTTGGAAGA AAACAAACAG CTTCCCAAAA TCTOGCTOCA CTACTTCAGA GACTGGCTTC 2940 

AGGGACTTCA GGATGCATTT GACAGTGACT GGGAAACCGG GAAAATCATG CCAAACAATT 3000 

ACAAGAATGG ATCAGACGAT GGAGTCCTTG CCTACAAACT CCTGGTGCAA ACCGGCAGCC 3060 

GCGATAAGCC CATCGACATC AGCCAGTTGA CTAAACAGCG TCTGGTGGAT GCAGATGGCA 3120 

TCATTAATCC CAGCGCTTTC TACATCTACC TGACGGCTTG GGTCAGCAAC GACCCCGTCG 3180 

CGTATGCTGC CTCCCAGGCC AACATCCOGC CACACCGACC AGAATGGGTC CACGACAAAG 3240 

CCGACTACAT GCCTGAAACA AGGCTGAGAA TCCOGGCAGC AGAGCCCATC GAGTATGCCC 3300 

AGTTCCCTTT CTACCTCAAC GGGTTGCGGG ACACCTCAGA CTTTGTGGAG GCAATTCAAA 3360 

AAGTAAGGAC CATCTGCAGC AACTATACGA GCCTGGGGCT GTCCAGTTAC CCCAACGGCT 3420 

ACCCCTTCCT CTTCTGGGAG CAGTACATOG GCCTCOGCCA CTGGCTGCTG CTGTTCATCA 3480 

GCGTGGTGTT GGCCTGCACA TTCCTCCTGT OCGCTGTCTT CCTTCTGAAC CCCTGGACGG 3540 

CCGGGATCAT TGTGATGCTC CTGGOGCTCA TCACGGTCGA GCTGTTCGCC ATGATGGGCC 3600 

TCATCGGAAT CAAGCTCAGT GCCXSTGCCCG TCCTCATCCT CATCGCTTCT GTTGGCATAG 3660 

GAGTGGAGTT CACCGTTCAC GTTGCTTT6G CCTTTCTGAC GGCCATCGGC GACAAGAACC 3720 

GCAGGGCTGT CCTTGCCCTG GAGCACATGT TTGCACCCGT CCTGGATGGC CCCGTGTCCA 3780 

CTCTGCTGGC AGTGCTGATG CTGGCGGGAT CTGAGTTCGA CTTCATTGTC AGGTATTTCT 3840 

TTGCTGTGCT GGCGATCCTC ACCATCCTCG GOGTTCTCAA TGGGCTGCTT TTGCTTCCCG 3900 

TGCTTTTGTC TTTCTTTGGA CCATATCCTG AGGTGTCTCC AGCCAAOGGC TTGAACOGCC 3960 

TGCCCAGACC CTCCCCTGAG CCACCCCCCA G0GTGGTCC6 CTTCGCCATG CCGCCCGGCC 4020 

ACACGCACAG CGGGTCTGAT TCCTCCCACT CGGAGTATAG TTCCCAGACG ACAGTGTGAG 4080 

GCCTCAGCGA GGAGCTTCXX; CACTACGAGG CCCAGCAGGG CGCGGGAGGC CCXGCCGACC 4140 

AAGTGATCGT GGAAGCCACA GAAAACCCCG TCTXCGCCCA CTCCACTGTG GTCCATCCOG 4200

AATCCAGCCA TCACCCACCC TCGAACCCCA CACAGCACCC CCACCTOCAC TCAGGCTCCC 4260

TGCCTCCCGC ACGOCAAGGC CAGCACCCCC GCAGCGACCC CCCCACAGAA GGCTTGTGGC 4320

CACCCCTCTA CAGACCGCGC AGAGACGCTT TTGAAATTTC TACTGAAGGG CATTCTGGCC 4380

CTAGCAATAG GGCCCCCTGG GGCCCTCGCG CGCCCCGTTC TCACAACCCT CGGAACCCAC 4440

CGTCCACTGC CATGGCCAGC TCC6TCCCCG GCTACTGCCA CCCCATCACC ACTGTGACGO 4500

CTTCTCCCTC CGTOACTGTC CCCGTCCACC CGCCGCCTGT CCCTCCGCCT GGGCGCAACC 4560

CCCGAGGGGG ACTCTGCCCA GGCTACCCTC AGACTGACCA CGGCCTGTTT GAGGACCCCC 4620

ACGTGCCTTT CCACGTCCGG TGTGAGAGGA GGGATTCGAA GGTGGAAGTC ATTGAGCTGC 4680

AOGACGTCGA ATGCGAGGAG AGGCCCCGGG GAAGCAGCTC CAACTGAGGG TGATTAAAAT 4740

CTGAAGCAAA GAGGCCAAAG ATTGGAAACC CCCCACCCCC ACCTCTTTCC AGAACTGCTT 4800

GAAGAGAACT GGTTGGAGTT ATGGAAAAGA TGCCCTGTGC CAGGACAGCA GTTCATTGTT 4860

ACTGTAACCG ATTGTATTAT TTTGTTAAAT ATTTCTATAA ATATTTAAGA GATGTACACA 4920

TGTGTAATAT AGGAAGGAAG GATGTAAAGT GGTATGATCT GGGGCTTCTC CACTCCTGCC 4980

CCAGAGTGTG GAGGCCACAG TGGGGCCTCT CCGTATTTGT GCATTGGGCT CCGTGCCACA 5040

ACCAAGCTTC ATTAGTCTTA AATTTCAGCA TATGTTGCTG CTGCTTAAAT ATTGTATAAT 5100

TTACTTGTAT AATTCTATGC AAATATTGCT TATGTAATAG GATTATTTTG TAAAGCTTTC 5160

TGTTTAAAAT ATTTTAAATT TGCATATCAC AACCCTGTGG TAGTATGAAA TGTTACT6TT 5220

AACTTTCAAA CACGCTATGC GTGATAATTT TTTTCTTTAA TGAGCAGATA TCAAGAAAGC 5280

CCGGAATT 5288

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1447 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Met Ala Ser Ala Gly Asn Ala Ala Glu Pro Gln Asp Arg Gly Gly Gly
1 5 10 15

Gly Ser Gly Cys Ile Gly Ala Pro Gly Arg Pro Ala Gly Gly Gly Arg
20 25 30

Arg Arg Arg Thr Gly Gly Leu Arg Arg Ala Ala Ala Pro Asp Arg Asp
35 40 45

Tyr Leu His Arg Pro Ser Tyr Cys Asp Ala Ala Phe Ala Leu Glu Gln
50 55 60

Ile Ser Lys Gly Lys Ala Thr Gly Arg Lys Ala Pro Leu Trp Leu Arg
65 70 75 80

Ala Lys Phe Gln Arg Leu Leu Phe Lys Leu Gly Cys Tyr Ile Gln Lys
85 90 95

Asn Cys Gly Lys Phe Leu Val Val Gly Leu Leu Ile Phe Gly Ala Phe
100 105 110

Ala Val Gly Leu Lys Ala Ala Asn Leu Glu Thr Asn Val Glu Glu Leu
115 120 125

Trp Val Glu Val Gly Gly Arg Val Ser Arg Glu Leu Asn Tyr Thr Arg
130 135 140

Gln Lys Ile Gly Glu Glu Ala Met Phe Asn Pro Gln Leu Met Ile Gln
145 150 155 160

Thr Pro Lys Glu Glu Gly Ala Asn Val Leu Thr Thr Glu Ala Leu Leu
165 170 175

Gln His Leu Asp Ser Ala Leu Gln Ala Ser Arg Val His Val Tyr Met
180 185 190

Tyr Asn Arg Gln Trp Lys Leu Glu His Leu Cys Tyr Lys Ser Gly Glu
195 200 205

Leu Ile Thr Glu Thr Gly Tyr Met Asp Gln Ile Ile Glu Tyr Leu Tyr
210 215 220

Pro Cys Leu Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly Ala Lys
225 230 235 240

Leu Gln Ser Gly Thr Ala Tyr Leu Leu Gly Lys Pro Pro Leu Arg Trp
245 250 255

Thr Asn Phe Asp Pro Leu Glu Phe Leu Glu Glu Leu Lys Lys Ile Asn
260 265 270

Tyr Gln Val Asp Ser Trp Glu Glu Met Leu Asn Lys Ala Glu Val Gly
275 280 285

His Gly Tyr Met Asp Arg Pro Cys Leu Asn Pro Ala Asp Pro Asp Cys
290 295 300

Pro Ala Thr Ala Pro Asn Lys Asn Ser Thr Lys Pro Leu Asp Met Ala
305 310 315 320

Leu Val Leu Asn Gly Gly Cys His Gly Leu Ser Arg Lys Tyr Met His
325 330 335



Trp Gln Glu Glu Leu Ile Val Gly Gly Thr Val Lys Asn Ser Thr Gly
340 345 350

Lys Leu Val Ser Ala His Ala Leu Gln Thr Met Phe Gln Leu Met Thr
355 360 365

Pro Lys Gln Met Tyr Glu His Phe Lys Gly Tyr Glu Tyr Val Ser His
370 375 380

Ile Asn Trp Asn Glu Asp Lys Ala Ala Ala Ile Leu Glu Ala Trp Gln
385 390 395 400

Arg Thr Tyr Val Glu Val Val His Gln Ser Val Ala Gln Asn Ser Thr
405 410 415

Gln Lys Val Leu Ser Phe Thr Thr Thr Thr Leu Asp Asp Ile Leu Lys
420 425 430

Ser Phe Ser Asp Val Ser Val Ile Arg Val Ala Ser Gly Tyr Leu Leu
435 440 445

Met Leu Ala Tyr Ala Cys Leu Thr Met Leu Arg Trp Asp Cys Ser Lys
450 455 460

Ser Gln Gly Ala Val Gly Leu Ala Gly Val Leu Leu Val Ala Leu Ser
465 470 475 480

Val Ala Ala Gly Leu Gly Leu Cys Ser Leu Ile Gly Ile Ser Phe Asn
485 490 495

Ala Ala Thr Thr Gln Val Leu Pro Phe Leu Ala Leu Gly Val Gly Val
500 505 510

Asp Asp Val Phe Leu Leu Ala His Ala Phe Ser Glu Thr Gly Gln Asn
515 520 525

Lys Arg Ile Pro Phe Glu Asp Arg Thr Gly Glu Cys Leu Lys Arg Thr
530 535 540

Gly Ala Ser Val Ala Leu Thr Ser Ile Ser Asn Val Thr Ala Phe Phe
545 550 555 560

Met Ala Ala Leu Ile Pro Ile Pro Ala Leu Arg Ala Phe Ser Leu Gln
565 570 575

Ala Ala Val Val Val Val Phe Asn Phe Ala Met Val Leu Leu Ile Phe
580 585 590

Pro Ala Ile Leu Ser Met Asp Leu Tyr Arg Arg Glu Asp Arg Arg Leu
595 600 605

Asp Ile Phe Cys Cys Phe Thr Ser Pro Cys Val Ser Arg Val Ile Gln
610 615 620

Val Glu Pro Gln Ala Tyr Thr Asp Thr His Asp Asn Thr Arg Tyr Ser
625 630 635 640

Pro Pro Pro Pro Tyr Ser Ser His Ser Phe Ala His Glu Thr Gln Ile
645 650 655

Thr Met Gln Ser Thr Val Gln Leu Arg Thr Glu Tyr Asp Pro His Thr
660 665 670

His Val Tyr Tyr Thr Thr Ala Glu Pro Arg Ser Glu Ile Ser Val Gln
675 680 685

Pro Val Thr Val Thr Gln Asp Thr Leu Ser Cys Gln Ser Pro Glu Ser
690 695 700

Thr Ser Ser Thr Arg Asp Leu Leu Ser Gln Phe Ser Asp Ser Ser Leu
705 710 715 720

His Cys Leu Glu Pro Pro Cys Thr Lys Trp Thr Leu Ser Ser Phe Ala
725 730 735

Glu Lys His Tyr Ala Pro Phe Leu Leu Lys Pro Lys Ala Lys Val Val
740 745 750

Val Ile Phe Leu Phe Leu Gly Leu Leu Gly Val Ser Leu Tyr Gly Thr
755 760 765

Thr Arg Val Arg Asp Gly Leu Asp Leu Thr Asp Ile Val Pro Arg Glu
770 775 780

Thr Arg Glu Tyr Asp Phe Ile Ala Ala Gln Phe Lys Tyr Phe Ser Phe
785 790 795 800

Tyr Asn Met Tyr Ile Val Thr Gln Lys Ala Asp Tyr Pro Asn Ile Gln
805 810 815

His Leu Leu Tyr Asp Leu His Arg Ser Phe Ser Asn Val Lys Tyr Val
820 825 830

Met Leu Glu Glu Asn Lys Gln Leu Pro Lys Met Trp Leu His Tyr Phe
835 840 845

Arg Asp Trp Leu Gln Gly Leu Gln Asp Ala Phe Asp Ser Asp Trp Glu
850 855 860

Thr Gly Lys Ile Met Pro Asn Asn Tyr Lys Asn Gly Ser Asp Asp Gly
865 870 875 880

Val Leu Ala Tyr Lys Leu Leu Val Gln Thr Gly Ser Arg Asp Lys Pro
885 890 895

Ile Asp Ile Ser Gln Leu Thr Lys Gln Arg Leu Val Asp Ala Asp Gly
900 905 910

Ile Ile Asn Pro Ser Ala Phe Tyr Ile Tyr Leu Thr Ala Trp Val Ser
915 920 925

Asn Asp Pro Val Ala Tyr Ala Ala Ser Gln Ala Asn Ile Arg Pro His
930 935 940

Arg Pro Glu Trp Val His Asp Lys Ala Asp Tyr Met Pro Glu Thr Arg
945 950 955 960



Leu Arg Ile Pro Ala Ala Glu Pro Ile Glu Tyr Ala Gln Phe Pro Phe
965 970 975

Tyr Leu Asn Gly Leu Arg Asp Thr Ser Asp Phe Val Glu Ala Ile Glu
980 985 990

Lys Val Arg Thr Ile Cys Ser Asn Tyr Thr Ser Leu Gly Leu Ser Ser
995 1000 1005

Tyr Pro Asn Gly Tyr Pro Phe Leu Phe Trp Glu Gln Tyr Ile Gly Leu
1010 1015 1020

Arg His Trp Leu Leu Leu Phe Ile Ser Val Val Leu Ala Cys Thr Phe
1025 1030 1035 1040

Leu Val Cys Ala Val Phe Leu Leu Asn Pro Trp Thr Ala Gly Ile Ile
1045 1050 1055

Val Met Val Leu Ala Leu Met Thr Val Glu Leu Phe Gly Met Met Gly
1060 1065 1070

Leu Ile Gly Ile Lys Leu Ser Ala Val Pro Val Val Ile Leu Ile Ala
1075 1080 1085

Ser Val Gly Ile Gly Val Glu Phe Thr Val His Val Ala Leu Ala Phe
1090 1095 1100

Leu Thr Ala Ile Gly Asp Lys Asn Arg Arg Ala Val Leu Ala Leu Glu
1105 1110 1115 1120

His Met Phe Ala Pro Val Leu Asp Gly Ala Val Ser Thr Leu Leu Gly
1125 1130 1135

Val Leu Met Leu Ala Gly Ser Glu Phe Asp Phe Ile Val Arg Tyr Phe
1140 1145 1150

Phe Ala Val Leu Ala Ile Leu Thr Ile Leu Gly Val Leu Asn Gly Leu
1155 1160 1165

Val Leu Leu Pro Val Leu Leu Ser Phe Phe Gly Pro Tyr Pro Glu Val
1170 1175 1180

Ser Pro Ala Asn Gly Leu Asn Arg Leu Pro Thr Pro Ser Pro Glu Pro
1185 1190 1195 1200

Pro Pro Ser Val Val Arg Phe Ala Met Pro Pro Gly His Thr His Ser
1205 1210 1215

Gly Ser Asp Ser Ser Asp Ser Glu Tyr Ser Ser Gln Thr Thr Val Ser
1220 1225 1230

Gly Leu Ser Glu Glu Leu Arg His Tyr Glu Ala Gln Gln Gly Ala Gly
1235 1240 1245

Gly Pro Ala His Gln Val Ile Val Glu Ala Thr Glu Asn Pro Val Phe
1250 1255 1260

Ala His Ser Thr Val Val His Pro Glu Ser Arg His His Pro Pro Ser
1265 1270 1275 1280

Asn Pro Arg Gln Gln Pro His Leu Asp Ser Gly Ser Leu Pro Pro Gly
1285 1290 1295

Arg Gln Gly Gln Gln Pro Arg Arg Asp Pro Pro Arg Glu Gly Leu Trp
1300 1305 1310

Pro Pro Leu Tyr Arg Pro Arg Arg Asp Ala Phe Glu Ile Ser Thr Glu
1315 1320 1325

Gly His Ser Gly Pro Ser Asn Arg Ala Arg Trp Gly Pro Arg Gly Ala
1330 1335 1340

Arg Ser His Asn Pro Arg Asn Pro Ala Ser Thr Ala Met Gly Ser Ser
1345 1350 1355 1360

Val Pro Gly Tyr Cys Gln Pro Ile Thr Thr Val Thr Ala Ser Ala Ser
1365 1370 1375

Val Thr Val Ala Val His Pro Pro Pro Val Pro Gly Pro Gly Arg Asn
1380 1385 1390

Pro Arg Gly Gly Leu Cys Pro Gly Tyr Pro Glu Thr Asp His Gly Leu
1395 1400 1405

Phe Glu Asp Pro His Val Pro Phe His Val Arg Cys Glu Arg Arg Asp
1410 1415 1420

Ser Lys Val Glu Val Ile Glu Leu Gln Asp Val Glu Cys Glu Glu Arg
1425 1430 1435 1440

Pro Arg Gly Ser Ser Ser Asn
1445

WHAT IS CLAIMED IS:

1. A DNA sequence other than present in a chromosome encoding a patched gene
other than the Drosophila patched gene or fragment thereof of at least about 12bp
different from the sequence of the Drosophila patched gene.

2. A DNA sequence according to Claim 1, wherein said patched gene is a
mammalian gene.

3. A DNA sequence according to Claim 1 for human, mouse, mosquito, butterfly
or beetle patched gene.

4. A DNA sequence according to Claim 3, wherein said DNA sequence is a
human sequence.

5. A DNA sequence according to Claim 4, wherein said DNA sequence is a
mouse sequence.

6. A DNA sequence according to Claim 1, wherein said DNA sequence is a
fragment of at least about 18bp.

7. A DNA sequence according to Claim 1 joined to a DNA sequence comprising
a restriction enzyme recognition sequence.

8. An expression cassette comprising a transcriptional initiation region functional
in an expression host, a DNA sequence according to Claim 1 under the
transcriptional regulation of said transcriptional initiation region, and a
transcriptional termination region functional in said expression host.

9. An expression cassette according to Claim 8, wherein said transcriptional
initiation region is heterologous to said DNA sequence according to Claim 1.

10. An expression cassette according to Claim 8, wherein said transcriptional
initiation region is homologous to said DNA sequence according to Claim 1 and
includes the enhancer region.

11. A cell comprising an expression cassette according to Claim 8 as part of an
extrachromosomal element or integrated into the genome of a host cell as a result of
introduction of said expression cassette into said host cell and the cellular progeny of
said host cell.

12. A cell according to Claim 11, further comprising the patched protein in the
cellular membrane of said cell.

13. A cell according to Claim 11, wherein said patched protein is a mouse patched
protein.

14. A cell according to Claim 11, wherein said patched gene is a human patched
protein.

15. A cell according to Claim 11, wherein said transcriptional initiation region is a
Drosophila patched gene transcriptional initiation region comprising the promoter
and enhancer joined to a heterologous gene.

16. A cell comprising an expression cassette comprising a transcriptional initiation
region functional in an expression host, said transcriptional initiation region
consisting of a 5' non-coding region regulating the transcription of patched protein
comprising the promoter and enhancer, a marker gene, and a transcriptional
termination region, as part of an extrachromosomal element or integrated into the
genome of a host cell as a result of introduction of said expression cassette into said
host, and the cellular progeny thereof.

17. A cell according to Claim 16, wherein said transcriptional initiation region is
the Drosophila region.

18. A method for following embryonic development employing the patched
protein in an embryo, said method comprising:

integrating an expression cassette comprising a transcriptional initiation region
functional in embryonic host cells, said transcriptional initiation region consisting of
a 5' non-coding region regulating the transcription of patched protein, a marker
gene, and a transcriptional termination region, wherein said embryonic host cells are
capable of developing into a fetus;

growing said embryonic host cells, whereby proliferation and differentiation
occur; and

locating cells comprising expression of the patched protein by means of
expression of said marker gene.

19. A method for producing patched protein, said method comprising:

growing a cell according to Claim 11, whereby said patched protein is
expressed; and

isolating said patched protein free of other proteins.

20. A method for screening candidate compounds for binding affinity to the
patched protein, said method comprising:

combining said candidate protein with a vertebrate or invertebrate cell
comprising said patched protein in the membrane of said cell and an expression
cassette comprising a transcriptional initiation region functional in said cell, a DNA
sequence according to Claim 1 comprising the entire coding sequence under the
transcriptional regulation of said transcriptional initiation region, and a
transcriptional termination region functional in said cell, expressing patched
protein in said cell; and

assaying for the binding of said candidate compound to said patched protein.

21. A method for screening candidate compounds for agonist activity with the
patched protein, said method comprising:

combining said candidate protein with a vertebrate or invertebrate cell
comprising said patched protein in the membrane of said cell and an expression
cassette comprising a transcriptional initiation region functional in an expression
host, said transcriptional initiation region consisting of a 5' non-coding region
regulating the transcription of patched protein, a marker gene, and a transcriptional
termination region, as part of an extrachromosomal element or integrated into the
genome of a host cell; and

assaying for the expression of said marker gene.

22. A monoclonal antibody binding specifically to a patched protein, other than
the Drosophila patched protein.

The inventions listed as Groups I-V do not relate to a single inventive concept under PCT Rule 13.1, because, under
PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons:

Groups I, II and V are not so linked as to form a single inventive concept because the DNA sequence of Group I, the
cell of Group II and the monoclonal antibody of Group V are drawn to three different products.

Groups II and III are not so linked as to form a single inventive concept because they are drawn to materially different
methods. The method of Group II involves growing a cell while the method of Group III involves combining a
candidate compound with a cell and then assaying for binding.

Groups II and IV are not so linked as to form a single inventive concept because they are drawn to materially different
methods. The method of Group II involves growing a cell while the method of Group IV involves combining a
candidate compound with a cell and then assaying for expression of a marker gene.